Posted to c-dev@xerces.apache.org by Dean Roddey <dr...@charmedquark.com> on 2000/05/15 05:39:11 UTC

Re: Best places for newsgroups?

This is the appropriate place for that kind of discussion. What exactly do
you want to know about? Also, when you post questions, give your platform,
compiler, and parser versions so that it's known what you are dealing
with.

--------------------------
Dean Roddey
The CIDLib Class Libraries
Charmed Quark Software
droddey@charmedquark.com
http://www.charmedquark.com

"Give me immortality, or give me death"

----- Original Message -----
> I'm just getting started w/ Xerces (IBM's XML4C actually) and have some
> questions about transcoding back and forth between UTF-8 and XMLCh.
>
> Can someone point me to the best news groups to post my questions?
>
> I tried news.alphaworks.ibm.com, but the news server won't seem to take
> any posts from me and the latest posts seem to be from late March. So I
> conclude they've shut it down.
>



Re: Newbie Questions on UTF-8 Transcoding

Posted by Dean Roddey <dr...@charmedquark.com>.
The date for the new release is up to Andy H. and crew. I'm now just an
outside observer since I've left IBM, so I don't know that kind of info.

--------------------------
Dean Roddey
The CIDLib Class Libraries
Charmed Quark Software
droddey@charmedquark.com
http://www.charmedquark.com

"Give me immortality, or give me death"

----- Original Message -----
From: "Calvin S. Powers" <po...@attglobal.net>
To: <xe...@xml.apache.org>
Sent: Monday, May 15, 2000 6:32 AM
Subject: Re: Newbie Questions on UTF-8 Transcoding


> For going the other direction, I'll try to dig through the current code
> trees and maybe extract out the logic I need. My problem is that I have to
> ship my code in just a few weeks. Any idea when the next release with the
> bi-directional transcoders will be released?
>
> It seems that going from XMLCh to UTF8 should be pretty straightforward,
> so I'm going to try to write my own if I can find an accurate, yet easy to
> read description of the algorithm. :-)
>



Re: Newbie Questions on UTF-8 Transcoding

Posted by "Calvin S. Powers" <po...@attglobal.net>.
Dean, 

Thanks for the info. 

My code to build a UTF-8 transcoder looks almost like yours, except that I used the predefined strings from the Uni class to specify the UTF-8 name passed to makeNewTranscoderFor. So I feel like I'm on the right track for going from UTF-8 to XMLCh. 

For going the other direction, I'll try to dig through the current code trees and maybe extract out the logic I need. My problem is that I have to ship my code in just a few weeks. Any idea when the next release with the bi-directional transcoders will be released?

It seems that going from XMLCh to UTF8 should be pretty straightforward, so I'm going to try to write my own if I can find an accurate, yet easy to read description of the algorithm. :-) 
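
For reference, the XMLCh-to-UTF-8 direction can be sketched without any library support. What follows is a minimal standalone sketch, not the Xerces implementation. It assumes XMLCh is a 16-bit UTF-16 code unit (Utf16Unit is a hypothetical stand-in for it), and the function name utf16ToUtf8 is made up for illustration. Replacing unpaired surrogates with U+FFFD is a choice made for this sketch; a real transcoder might report an error instead.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for XMLCh, assumed to be a 16-bit UTF-16 code unit.
typedef std::uint16_t Utf16Unit;

// Encode `len` UTF-16 code units as a UTF-8 byte string.
// Unpaired surrogates are replaced with U+FFFD (a choice for this sketch).
std::string utf16ToUtf8(const Utf16Unit* src, std::size_t len)
{
    std::string out;
    for (std::size_t i = 0; i < len; ++i)
    {
        std::uint32_t cp = src[i];

        if (cp >= 0xD800 && cp <= 0xDBFF)
        {
            // High surrogate: combine with a following low surrogate.
            if (i + 1 < len && src[i + 1] >= 0xDC00 && src[i + 1] <= 0xDFFF)
            {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (src[i + 1] - 0xDC00);
                ++i;
            }
            else
            {
                cp = 0xFFFD; // high surrogate with no partner
            }
        }
        else if (cp >= 0xDC00 && cp <= 0xDFFF)
        {
            cp = 0xFFFD; // stray low surrogate
        }

        if (cp < 0x80)
        {
            // 1 byte: 0xxxxxxx
            out += static_cast<char>(cp);
        }
        else if (cp < 0x800)
        {
            // 2 bytes: 110xxxxx 10xxxxxx
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
        else if (cp < 0x10000)
        {
            // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
        else
        {
            // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

The number of output bytes depends only on the code point's range, so the whole algorithm is a surrogate-pair merge followed by a four-way branch.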

--csp

 

At 09:55 PM 5/14/00 -0700, you wrote:
>This functionality is now in place, but it's not in the current 1.1.0/3.1.0
>release. It'll be available in the next release, which should be pretty
>close to happening. Basically, the XML transcoders will transcode in both
>directions now. The LCP (local code page) transcoders only work on the local
>code page, so you've found the right classes to use. You just need to wait
>for the next release, or work with the current code until then.
>
>Actually, to be pedantic, you shouldn't create the UTF-8 transcoder directly.
>You should use the TransService object, which is pointed to by
>XMLPlatformUtils::fgTransService after you've initialized. It has a method
>to create new transcoders by name. So it would look something like:
>
>XMLTransService::Codes resCode;
>XMLTranscoder* pXcoder =
>   XMLPlatformUtils::fgTransService->makeNewTranscoderFor
>   (
>       L"UTF-8"
>       , resCode
>       , 4096
>   );
>
>This will create a transcoder for you. The last parameter indicates the
>largest buffer you'll ask it to transcode at a time. This allows the
>transcoder to efficiently pre-allocate any internal data structures it
>needs in order to do the work. If the call fails and returns null, resCode
>will give you a hint as to why.
>
>--------------------------
>Dean Roddey
>The CIDLib Class Libraries
>Charmed Quark Software
>droddey@charmedquark.com
>http://www.charmedquark.com
>
>"Give me immortality, or give me death"
>
========================================================================
Calvin S. Powers                                          current events
mailto:powers@attglobal.net                           cultural phenomena
http://www.sff.net/people/powers                            true stories
"cannon fodder in the culture war"        http://www.StuckInTraffic.com/
========================================================================

Re: Newbie Questions on UTF-8 Transcoding

Posted by Dean Roddey <dr...@charmedquark.com>.
This functionality is now in place, but it's not in the current 1.1.0/3.1.0
release. It'll be available in the next release, which should be pretty
close to happening. Basically, the XML transcoders will transcode in both
directions now. The LCP (local code page) transcoders only work on the local
code page, so you've found the right classes to use. You just need to wait
for the next release, or work with the current code until then.

Actually, to be pedantic, you shouldn't create the UTF-8 transcoder directly.
You should use the TransService object, which is pointed to by
XMLPlatformUtils::fgTransService after you've initialized. It has a method
to create new transcoders by name. So it would look something like:

XMLTransService::Codes resCode;
XMLTranscoder* pXcoder =
   XMLPlatformUtils::fgTransService->makeNewTranscoderFor
   (
       L"UTF-8"
       , resCode
       , 4096
   );

This will create a transcoder for you. The last parameter indicates the
largest buffer you'll ask it to transcode at a time. This allows the
transcoder to efficiently pre-allocate any internal data structures it
needs in order to do the work. If the call fails and returns null, resCode
will give you a hint as to why.
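
As a point of reference for what the UTF-8 to XMLCh direction involves, here is a minimal standalone sketch. It is not the Xerces transcoder and does not use its API. Utf16Unit is a hypothetical stand-in for XMLCh (assumed to be 16 bits), and the function name utf8ToUtf16 is made up for illustration. It decodes a whole string at once instead of working in fixed-size blocks, and it rejects malformed input by returning false, which is a choice made for this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for XMLCh, assumed to be a 16-bit UTF-16 code unit.
typedef std::uint16_t Utf16Unit;

// Decode UTF-8 bytes into UTF-16 code units. Returns false on malformed
// input: bad lead byte, truncated or invalid continuation byte, overlong
// form, encoded surrogate, or a value above U+10FFFF.
bool utf8ToUtf16(const std::string& src, std::vector<Utf16Unit>& out)
{
    out.clear();
    std::size_t i = 0;
    while (i < src.size())
    {
        const unsigned char lead = static_cast<unsigned char>(src[i]);
        std::uint32_t cp;     // code point being assembled
        std::size_t extra;    // continuation bytes expected after the lead
        std::uint32_t minCp;  // smallest value legal for this sequence length

        if (lead < 0x80)                { cp = lead;        extra = 0; minCp = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; minCp = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; minCp = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; minCp = 0x10000; }
        else return false;    // continuation byte or invalid lead byte

        if (src.size() - i - 1 < extra) return false; // truncated sequence

        for (std::size_t k = 1; k <= extra; ++k)
        {
            const unsigned char c = static_cast<unsigned char>(src[i + k]);
            if ((c & 0xC0) != 0x80) return false;     // not 10xxxxxx
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;

        if (cp < minCp || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;

        if (cp < 0x10000)
        {
            out.push_back(static_cast<Utf16Unit>(cp));
        }
        else
        {
            // Split into a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<Utf16Unit>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<Utf16Unit>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return true;
}
```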

--------------------------
Dean Roddey
The CIDLib Class Libraries
Charmed Quark Software
droddey@charmedquark.com
http://www.charmedquark.com

"Give me immortality, or give me death"

----- Original Message -----
From: "Calvin S. Powers" <po...@attglobal.net>
To: <xe...@xml.apache.org>
Sent: Sunday, May 14, 2000 11:49 PM
Subject: Newbie Questions on UTF-8 Transcoding


> Greetings,
>
> I've been putting together some code based on the XML4C parser and I'm
> still figuring out how all the classes hang together. My biggest problem
> right now is with UTF-8 encoding.
>
> It seems that the XML4C classes are geared for going back and forth
> between the operating system's local code page and XMLCh (i.e. 16 bit
> Unicode).
>
> But in my application, I've got data that's UTF8 encoded and I need to
> insert it into a DOM Text node (in some cases) and (in some cases) just
> compare it to the value of existing TEXT nodes.
>
> Well, by digging through the source I could puzzle a way to create my own
> XMLUTF8Transcoder object and get the data from UTF8 into a string of XMLCh.
> (Though I haven't tested it yet :-)
>
> But what I can't figure out how to do is go from a string of XMLCh
> characters into a UTF8 encoded string.
>
> I keep thinking that if I fiddle with XMLLCPTranscoder enough I'll figure
> it out, but so far, no luck.
>
> Can anyone clue me in on how to go from an XMLCh string to a UTF8 string?
> Double bonus points for some example code to look at.
>
> After reading through the alphaworks website some, I'm starting to get the
> impression that going from 16 bit Unicode to UTF-8 is not something that the
> package covers, in which case, my question boils down to: Help! Where can I
> find some example code to study so I can write my own?
>
> A Thousand Thanks In Advance!
>
> ========================================================================
> Calvin S. Powers                                          current events
> mailto:powers@attglobal.net                           cultural phenomena
> http://www.sff.net/people/powers                            true stories
> "cannon fodder in the culture war"        http://www.StuckInTraffic.com/
> ========================================================================
>
>


Newbie Questions on UTF-8 Transcoding

Posted by "Calvin S. Powers" <po...@attglobal.net>.
Greetings, 

I've been putting together some code based on the XML4C parser and I'm still figuring out how all the classes hang together. My biggest problem right now is with UTF-8 encoding. 

It seems that the XML4C classes are geared for going back and forth between the operating system's local code page and XMLCh (i.e. 16 bit Unicode).

But in my application, I've got data that's UTF8 encoded and I need to insert it into a DOM Text node (in some cases) and (in some cases) just compare it to the value of existing TEXT nodes. 

Well, by digging through the source I could puzzle a way to create my own XMLUTF8Transcoder object and get the data from UTF8 into a string of XMLCh. (Though I haven't tested it yet :-) 

But what I can't figure out how to do is go from a string of XMLCh characters into a UTF8 encoded string. 

I keep thinking that if I fiddle with XMLLCPTranscoder enough I'll figure it out, but so far, no luck. 

Can anyone clue me in on how to go from an XMLCh string to a UTF8 string? Double bonus points for some example code to look at. 

After reading through the alphaworks website some, I'm starting to get the impression that going from 16 bit Unicode to UTF-8 is not something that the package covers, in which case, my question boils down to: Help! Where can I find some example code to study so I can write my own? 

A Thousand Thanks In Advance!

========================================================================
Calvin S. Powers                                          current events
mailto:powers@attglobal.net                           cultural phenomena
http://www.sff.net/people/powers                            true stories
"cannon fodder in the culture war"        http://www.StuckInTraffic.com/
========================================================================