Posted to c-dev@axis.apache.org by Din%$h <xy...@gmail.com> on 2005/03/22 07:17:36 UTC

identifying utf-16 encoded characters

Hi all,

I'm trying to add encoding support to the tspp parser. Can someone
please tell me how to identify UTF-16 encoded characters? That is,
after identifying the BOM (Byte Order Mark), I want to determine
whether the next character is 2 bytes or 4 bytes.

cheers,
Dinesh
-- 
W.Dinesh Premalal
+94-773-034326
premalwd@cse.mrt.ac.lk
http://www.cse.mrt.ac.lk/~premalwd/

Re: identifying utf-16 encoded characters

Posted by John Hawkins <ha...@uk.ibm.com>.
Have you looked at ICU4C, which has a lot of support for this?

We are looking at using it in 1.6 and beyond to help us in this area.
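
For the specific question of 2 versus 4 bytes, the UTF-16 rule itself is small enough to check by hand: a 16-bit code unit in the range 0xD800-0xDBFF is a high surrogate and must be followed by a low surrogate in 0xDC00-0xDFFF, so that character occupies 4 bytes; any other valid code unit is a complete 2-byte character. The following is only a minimal sketch of that rule in plain C, not code from the tspp parser or from ICU4C. The function names (utf16_bom_is_big_endian, utf16_read_unit, utf16_char_len) and the choice to return 0 for truncated or ill-formed input are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 for a big-endian BOM (FE FF), 0 for a
 * little-endian BOM (FF FE), and -1 if no UTF-16 BOM is present. */
static int utf16_bom_is_big_endian(const unsigned char *buf, size_t len)
{
    if (len < 2)
        return -1;
    if (buf[0] == 0xFE && buf[1] == 0xFF)
        return 1;
    if (buf[0] == 0xFF && buf[1] == 0xFE)
        return 0;
    return -1;
}

/* Reads one 16-bit code unit from two bytes in the given byte order. */
static uint16_t utf16_read_unit(const unsigned char *p, int big_endian)
{
    return big_endian ? (uint16_t)((p[0] << 8) | p[1])
                      : (uint16_t)((p[1] << 8) | p[0]);
}

/* Returns the byte length of the character starting at p:
 *   2 for a character in the Basic Multilingual Plane,
 *   4 for a surrogate pair,
 *   0 if the input is truncated or ill-formed (assumed convention). */
static size_t utf16_char_len(const unsigned char *p, size_t remaining,
                             int big_endian)
{
    uint16_t unit;

    if (remaining < 2)
        return 0;
    unit = utf16_read_unit(p, big_endian);

    if (unit >= 0xD800 && unit <= 0xDBFF) {
        /* High surrogate: a low surrogate must follow. */
        uint16_t next;
        if (remaining < 4)
            return 0;
        next = utf16_read_unit(p + 2, big_endian);
        return (next >= 0xDC00 && next <= 0xDFFF) ? 4 : 0;
    }
    if (unit >= 0xDC00 && unit <= 0xDFFF)
        return 0; /* low surrogate without a preceding high surrogate */

    return 2;
}
```

A caller would check the BOM once, skip its 2 bytes, and then call utf16_char_len repeatedly, advancing by the returned length. If no BOM is present, the byte order has to come from elsewhere (for example the declared encoding), which this sketch does not attempt.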




