Posted to j-users@xerces.apache.org by Praveen Peddi <pp...@contextmedia.com> on 2004/07/13 17:35:57 UTC
Best way to read non-utf xml documents
I have input XML files in "windows-1252" encoding, and I have to convert them to UTF-8 and send them to a server (the server assumes that all input XML files are UTF-8 encoded). When I read the files and write them out as UTF-8, I lose some special characters such as the registered-trademark and copyright signs.
I read each file without specifying an encoding for the input stream, so it is decoded in the OS native encoding, and I write the output in UTF-8.
What is the best way to read non-UTF-8 encoded XML files and output them in UTF-8?
Any help would be appreciated...
Thanks
Praveen
**************************************************************
Praveen Peddi
Sr Software Engg, Context Media, Inc.
email:ppeddi@contextmedia.com
Tel: 401.854.3475
Fax: 401.861.3596
web: http://www.contextmedia.com
**************************************************************
Context Media- "The Leader in Enterprise Content Integration"
Re: Best way to read non-utf xml documents
Posted by Suresh Babu Koya <sk...@in-reality.com>.
When you do not specify an encoding, Java readers and writers use the platform default, which is taken from the system property "file.encoding".
Use the Xerces serializer API to write the XML out, and set the output encoding there.
/Suresh
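A minimal sketch of the approach described above, with two assumptions that are not from the thread: it uses the standard JAXP Transformer in place of the Xerces-specific serializer classes, and it assumes each input file carries an encoding="windows-1252" XML declaration. The class and method names (RecodeXml, recodeToUtf8) are hypothetical. The key point is to hand the parser the raw bytes as an InputStream, not a Reader, so the parser picks the encoding from the XML declaration instead of "file.encoding".

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

// Hypothetical helper class, for illustration only.
class RecodeXml {

    // Parses XML from raw bytes and writes it back out as UTF-8.
    // The parser decodes the input using the encoding named in the
    // XML declaration, so the platform default is never consulted.
    public static void recodeToUtf8(InputStream in, OutputStream out) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(in);

        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        // Sets both the byte encoding and the encoding in the output declaration.
        serializer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        serializer.transform(new DOMSource(doc), new StreamResult(out));
    }

    public static void main(String[] args) throws Exception {
        // Sample document with the registered (U+00AE) and copyright (U+00A9)
        // signs, encoded as windows-1252 bytes to stand in for an input file.
        String xml = "<?xml version=\"1.0\" encoding=\"windows-1252\"?>"
                + "<name>Acme\u00AE \u00A9 2004</name>";
        byte[] input = xml.getBytes("windows-1252");

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        recodeToUtf8(new ByteArrayInputStream(input), out);

        // Decode explicitly as UTF-8 to inspect the result.
        System.out.println(new String(out.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

For real files, a FileInputStream and a FileOutputStream can be passed in place of the in-memory streams. If an input file has no encoding declaration, the parser will assume UTF-8, and the bytes would need to be decoded explicitly as windows-1252 before parsing.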