Posted to j-dev@xerces.apache.org by Robert La Ferla <ro...@mediaone.net> on 2000/08/03 19:40:19 UTC

DOM - Parsing XML document with "Windows-1252" encoding.

Developers,

I am using the DOMParser (under Solaris) to parse a simple XML document
into memory.  Unfortunately, the XML file is in "Windows-1252"
encoding, i.e. the document header has <?xml version="1.0"
encoding="Windows-1252" ?> and each line in the document ends with a CR
(without LF).  When I run my code, I get the following error message:

Error in parsing: The encoding "Windows-1252" is not supported.

What's the best way to work around this?  I'd hate to have to retrieve
the document, and convert the CR/LF pairs to CR and then change the
encoding type, just to get this to work...

Sincerely,
Robert



Re: DOM - Parsing XML document with "Windows-1252" encoding.

Posted by Mike Pogue <mp...@apache.org>.
Try using the encoding name "cp1252".

http://xml.apache.org/xerces-j/faq-general.html#faq-3

Mike
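
A minimal sketch of trying this suggestion, assuming the parser is reached through JAXP and that the underlying implementation is Xerces-based (as the parser bundled with the JDK is). The feature URI is the one quoted later in this thread; the class name AllowJavaEncodingsDemo and the sample document are made up for illustration. When Xerces is used directly, the same setFeature(String, boolean) call should be available on org.apache.xerces.parsers.DOMParser.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class, not part of Xerces.
class AllowJavaEncodingsDemo {
    // Xerces-specific feature that lets the encoding declaration use
    // Java encoding names such as "Cp1252".
    static final String ALLOW_JAVA_ENCODINGS =
            "http://apache.org/xml/features/allow-java-encodings";

    static Document parse(byte[] xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Assumption: the JAXP implementation is Xerces-based; another
        // implementation may reject this feature with an exception.
        factory.setFeature(ALLOW_JAVA_ENCODINGS, true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new ByteArrayInputStream(xml));
    }

    public static void main(String[] args) throws Exception {
        // Sample document declared as "Cp1252" and encoded accordingly;
        // the euro sign is a character that differs from ISO-8859-1.
        String text = "<?xml version=\"1.0\" encoding=\"Cp1252\"?>"
                + "<price>\u20AC5</price>";
        Document doc = parse(text.getBytes(Charset.forName("windows-1252")));
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

Documents relabelled this way are tied to Java-based parsers, so this is best treated as a local workaround rather than a fix for the files themselves.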

> ---------------------------------------------------------------------
> To unsubscribe, e-mail: xerces-j-dev-unsubscribe@xml.apache.org
> For additional commands, e-mail: xerces-j-dev-help@xml.apache.org

RE: DOM - Parsing XML document with "Windows-1252" encoding.

Posted by Yoga Balaji <y....@1internet.com>.
Hey Bob.
I had the same problem three weeks ago and resolved it with the following
email I received from Andy Clark of the Apache XML developer group.
Solution #3 worked for me.
Hope this helps.
YoGA

****************************************************
The problem is that "Windows-1252" is *not* a valid encoding name.
The XML specification states that all encoding names must be IANA
names. However, "Windows-1252" is not. Unfortunately, when the
Microsoft XML parser writes an XML document, it automatically
includes this encoding.

There are several solutions:

1) Convert all of the incoming documents from "Windows-1252"
   encoding to a proper encoding. Make sure to modify the
   encoding line at the top of the file to reflect the change.

2) Modify the encoding line in all of your files to be
   "Cp1252" (case is important this time). Then turn on the
   following feature in the parser:

     http://apache.org/xml/features/allow-java-encodings

   Please note that your documents won't be portable, in much
   the same way that the "Windows-1252" encoding name doesn't
   work everywhere.

3) Modify the MIME2Java.java source file to include a mapping
   for "Windows-1252" to "Cp1252". Recompile and rebuild the
   Jar file. Use the new Jar file and you're done.
******************************
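
Solution 1 above can be sketched with nothing but the JDK. This is only an illustration under stated assumptions: the class name Cp1252ToUtf8 and its convert method are hypothetical, UTF-8 is chosen as the "proper encoding", and the regular expression only handles an XML declaration at the very start of the file.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Xerces: re-encodes a Windows-1252
// XML document as UTF-8 and updates its encoding declaration.
class Cp1252ToUtf8 {
    // Matches the encoding pseudo-attribute of a leading XML declaration.
    // Group 1 is everything up to the opening quote, group 2 is the quote.
    private static final Pattern ENCODING_DECL = Pattern.compile(
            "^(<\\?xml[^>]*?encoding\\s*=\\s*)([\"'])[A-Za-z][A-Za-z0-9._-]*\\2");

    static byte[] convert(byte[] cp1252Bytes) {
        // Decode with the charset the bytes are actually in.
        String text = new String(cp1252Bytes, Charset.forName("windows-1252"));
        // Rewrite the declared encoding to match the new bytes. A document
        // with no declaration is left alone, since UTF-8 is the XML default.
        Matcher m = ENCODING_DECL.matcher(text);
        if (m.find()) {
            text = m.replaceFirst("$1$2UTF-8$2");
        }
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java Cp1252ToUtf8 <input.xml> <output.xml>");
            return;
        }
        byte[] in = Files.readAllBytes(Paths.get(args[0]));
        Files.write(Paths.get(args[1]), convert(in));
    }
}
```

The conversion leaves line endings untouched; a conforming XML parser normalizes them on input, so the CR-only lines should not need separate handling.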
