Posted to c-users@xerces.apache.org by jinesh kj <ji...@gmail.com> on 2007/12/04 07:35:08 UTC

converting XMLCh* in unicode to string

hi,

I have a few questions about this. My program reads the text content of an
XML element with getTextContent(), and I want to transcode it to Unicode. A
slightly off-topic question: can I store the string in a std::string, like
any non-Unicode string, and use member functions such as find()? (Basically,
I need to read the string word by word.) Also, I believe
XMLString::transcode() transcodes to the native encoding. Would using a
std::wstring solve the problem?

regards

Jinesh K J

P.S. Sorry, I know this question is more about C++. If anyone knows the
answer, please help me.

-- 
My Feelings,Expressions-
http://logbookofanobserver.blogspot.com

SMC : My computer, My language http://smc.org.in
Swathanthra Malayalam Computing: my language for my computer

Re: converting XMLCh* in unicode to string

Posted by jinesh kj <ji...@gmail.com>.
Thank you for your reply, Dave. Can you please direct me to the discussion? I
checked the archives but couldn't identify the specific thread.

regards

Jinesh K J

On 12/4/07, David Bertoni <db...@apache.org> wrote:
>
> jinesh kj wrote:
> > hi,
> >
> > I have some doubts in this regard. I am having a program which reads the
> > text content of an XML element by getTextContent and i want to transcode
> > it to unicode.
>
> String data in Xerces-C is encoded in UTF-16, so it's already Unicode
> data.
>
> > I have a simple doubt out of topic, like can i store the string
> > in std::string as other non unicode strings and use functions like find
> > and all?(basically i need to read the string word by word). Also
> > XMLString::transcode will transcode to native encoding i believe.
>
> Yes, XMLString::transcode() will transcode a string to the local code
> page, so it's not very helpful if you care about the fidelity of the
> data.  You can transcode UTF-16 to UTF-8 and store the UTF-8 bytes in
> std::string, but you may not be able to use all of std::string's member
> functions.  There was a very recent thread on the mailing list about
> this, so you should review the recent postings to the list.
>
> > If I use a std::wstring, can i solve the problem?
>
> Unlikely, unless your platform supports UTF-16 for the wchar_t data type.
> Windows is the only platform that does this consistently.  Otherwise, you
> will probably have to fall back to std::vector<XMLCh>.
>
> Dave
>




Re: converting XMLCh* in unicode to string

Posted by David Bertoni <db...@apache.org>.
jinesh kj wrote:
> hi,
> 
> I have some doubts in this regard. I am having a program which reads the
> text content of an XML element by getTextContent and i want to transcode it
> to unicode.

String data in Xerces-C is encoded in UTF-16, so it's already Unicode data.

> I have a simple doubt out of topic, like can i store the string
> in std::string as other non unicode strings and use functions like find and
> all?(basically i need to read the string word by word). Also
> XMLString::transcode will transcode to native encoding i believe.

Yes, XMLString::transcode() will transcode a string to the local code page,
so it's not very helpful if you care about the fidelity of the data: any
character the code page cannot represent is lost.  You can transcode UTF-16
to UTF-8 and store the UTF-8 bytes in std::string, but std::string's member
functions work on bytes, not characters, so some of them (length(),
operator[], substr()) may not do what you expect for non-ASCII text.  There
was a very recent thread on the mailing list about this, so you should
review the recent postings to the list.
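
A minimal sketch of the UTF-8 route described above, not a definitive
implementation. It assumes XMLCh is a 16-bit UTF-16 code unit and uses
char16_t as a stand-in so that it compiles without the Xerces-C headers.
The names Utf16Unit, utf16ToUtf8 and splitUtf8Words are hypothetical and
are not part of Xerces-C. Xerces-C 3.x also ships a TranscodeToStr helper
that could do the conversion instead; it is not used here.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Assumption: stand-in for XMLCh, taken to be one UTF-16 code unit.
typedef char16_t Utf16Unit;

// Convert a null-terminated UTF-16 string to UTF-8 bytes.
// Unpaired surrogates are replaced with U+FFFD.
std::string utf16ToUtf8(const Utf16Unit* src)
{
    std::string out;
    if (!src) return out;
    for (std::size_t i = 0; src[i] != 0; ++i) {
        char32_t cp = src[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: src[i + 1] is at worst the terminator.
            char32_t lo = src[i + 1];
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // low surrogate with no preceding high surrogate
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// Split UTF-8 text on ASCII whitespace. This is safe at the byte level
// because ASCII bytes never occur inside a multi-byte UTF-8 sequence.
std::vector<std::string> splitUtf8Words(const std::string& utf8)
{
    std::vector<std::string> words;
    const char* ws = " \t\r\n";
    std::size_t start = utf8.find_first_not_of(ws);
    while (start != std::string::npos) {
        std::size_t end = utf8.find_first_of(ws, start);
        words.push_back(utf8.substr(
            start, end == std::string::npos ? std::string::npos : end - start));
        start = utf8.find_first_not_of(ws, end);
    }
    return words;
}
```

With this approach find() still locates a UTF-8 substring correctly, but
the positions it returns are byte offsets, and length() counts bytes rather
than characters. Only ASCII whitespace is treated as a word separator here.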

> If I use a std::wstring, can i solve the problem?

Unlikely, unless your platform supports UTF-16 for the wchar_t data type.
Windows is the only platform that does this consistently; on most Unix-like
systems wchar_t is 32 bits wide.  If yours does not, you will probably have
to fall back to std::vector<XMLCh>.
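
A rough sketch of the std::vector fallback mentioned above, for reading
the text word by word without leaving UTF-16. As an assumption, char16_t
stands in for XMLCh so the code compiles without Xerces-C, and the names
Utf16Unit, toUnitVector, isXmlSpace, splitUtf16Words and wcharHoldsUtf16Units
are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Assumption: stand-in for XMLCh, taken to be one UTF-16 code unit.
typedef char16_t Utf16Unit;

// Copy a null-terminated UTF-16 string into a vector (terminator excluded).
std::vector<Utf16Unit> toUnitVector(const Utf16Unit* s)
{
    std::size_t n = 0;
    if (s) {
        while (s[n] != 0) ++n;
    }
    return s ? std::vector<Utf16Unit>(s, s + n) : std::vector<Utf16Unit>();
}

// The four whitespace characters defined by XML 1.0.
bool isXmlSpace(Utf16Unit c)
{
    return c == 0x20 || c == 0x09 || c == 0x0A || c == 0x0D;
}

// Split on XML whitespace. Surrogate pairs stay intact because neither
// half of a pair can compare equal to a whitespace code unit.
std::vector<std::vector<Utf16Unit> >
splitUtf16Words(const std::vector<Utf16Unit>& text)
{
    std::vector<std::vector<Utf16Unit> > words;
    std::vector<Utf16Unit> current;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (isXmlSpace(text[i])) {
            if (!current.empty()) {
                words.push_back(current);
                current.clear();
            }
        } else {
            current.push_back(text[i]);
        }
    }
    if (!current.empty()) words.push_back(current);
    return words;
}

// True when wchar_t is the same size as a UTF-16 code unit, in which case
// std::wstring could hold the same code units unchanged.
bool wcharHoldsUtf16Units()
{
    return sizeof(wchar_t) == sizeof(Utf16Unit);
}
```

std::basic_string<XMLCh> is another option if you want find() and
substr(), though whether the standard library supports that
instantiation depends on how XMLCh is defined on your platform.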

Dave