Posted to j-users@xerces.apache.org by olegabr <ol...@mailex.ru> on 2001/02/22 05:48:06 UTC

win-1251 encoding support

hi all!
I use xerces-j 1.3.0 with jdk 1.1.8
and I am trying to parse XML files that were written in the windows-1251
encoding. I get an error: unsupported encoding "windows-1251".
What can I do to avoid this problem?
                                                                     olegabr.

Re: win-1251 encoding support

Posted by Andy Clark <an...@apache.org>.
Mark Diekhans wrote:
> The IANA name is `windows-1251'; try `Cp1251', which is probably the
> alias java is using. (Cp == Code page).

I stand corrected. Thank you. For the official list, check out:

  http://www.isi.edu/in-notes/iana/assignments/character-sets

In the meantime, if you want to use Java encoding names,
then you should turn on that feature in the parser. Here's
an example:

  parser.setFeature("http://apache.org/xml/features/allow-java-encodings",
                    true);
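
For context, here is a minimal sketch of that call inside a complete SAX
program. It goes through the JAXP API bundled with current JDKs rather
than the Xerces 1.3.0 classes, and it assumes the parser behind JAXP is
Xerces-derived; other parsers may reject the feature with a
SAXNotRecognizedException. The class name JavaEncodingDemo and the helper
parseText are hypothetical.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class JavaEncodingDemo {
    static final String ALLOW_JAVA_ENCODINGS =
        "http://apache.org/xml/features/allow-java-encodings";

    /** Parses the given bytes and returns the concatenated character data. */
    static String parseText(byte[] xml) throws Exception {
        XMLReader reader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        // Assumption: the underlying parser is Xerces-derived and
        // recognizes this feature.
        reader.setFeature(ALLOW_JAVA_ENCODINGS, true);

        final StringBuilder text = new StringBuilder();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        });
        reader.parse(new InputSource(new ByteArrayInputStream(xml)));
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // The XML declaration uses the Java name "Cp1251" instead of the
        // IANA name "windows-1251"; the content is Cyrillic text.
        String doc = "<?xml version=\"1.0\" encoding=\"Cp1251\"?>"
            + "<greeting>\u041f\u0440\u0438\u0432\u0435\u0442</greeting>";
        System.out.println(parseText(doc.getBytes("windows-1251")));
    }
}
```

The feature only changes which encoding names the parser accepts in the
XML declaration; the bytes still have to be in the encoding that name
refers to.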

-- 
Andy Clark * IBM, TRL - Japan * andyc@apache.org

Re: win-1251 encoding support

Posted by Mark Diekhans <ma...@lutris.com>.
Andy Clark <an...@apache.org> writes:
> olegabr wrote:
> > I use xerces-j 1.3.0 with jdk 1.1.8
> > and I am trying to parse XML files that were written in the windows-1251
> > encoding. I get an error: unsupported encoding "windows-1251".
> > What can I do to avoid this problem?
> 
> Use the IANA encoding name that is equivalent to the Windows
> encoding name that you are currently using. I'm not sure what
> that is, though.

The IANA name is `windows-1251'; try `Cp1251', which is probably the
alias java is using. (Cp == Code page).
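
On a current JDK, one way to check how the two names relate is the
java.nio.charset API (added in Java 1.4, so not available on JDK 1.1.8).
A minimal sketch; the class name EncodingNames and its helpers are
hypothetical:

```java
import java.nio.charset.Charset;

public class EncodingNames {
    /** Returns the canonical name this JVM uses for a charset name or alias. */
    static String canonicalName(String name) {
        return Charset.forName(name).name();
    }

    /** True if both names are supported and resolve to the same charset. */
    static boolean sameCharset(String a, String b) {
        return Charset.isSupported(a) && Charset.isSupported(b)
            && Charset.forName(a).equals(Charset.forName(b));
    }

    public static void main(String[] args) {
        System.out.println(canonicalName("Cp1251"));
        System.out.println(sameCharset("windows-1251", "Cp1251"));
        // Every alias this JVM accepts for the charset.
        System.out.println(Charset.forName("windows-1251").aliases());
    }
}
```

This only shows what the JVM accepts; whether the XML parser accepts the
same name in an encoding declaration is a separate question.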

Mark

Re: win-1251 encoding support

Posted by Andy Clark <an...@apache.org>.
olegabr wrote:
> I use xerces-j 1.3.0 with jdk 1.1.8
> and I am trying to parse XML files that were written in the windows-1251
> encoding. I get an error: unsupported encoding "windows-1251".
> What can I do to avoid this problem?

Use the IANA encoding name that is equivalent to the Windows
encoding name that you are currently using. I'm not sure what
that is, though.

-- 
Andy Clark * IBM, TRL - Japan * andyc@apache.org

Re: win-1251 encoding support

Posted by jean-frederic clere <jf...@fujitsu-siemens.com>.
olegabr wrote:
> 
> hi all!
> I use xerces-j 1.3.0 with jdk 1.1.8
> and I am trying to parse XML files that were written in the windows-1251
> encoding. I get an error: unsupported encoding "windows-1251".
> What can I do to avoid this problem?
>                                                                      olegabr.

I found this at IANA:
+++
Name: windows-1251
MIBenum: 2251
Source: Microsoft  (see ../character-set-info/windows-1251)
[Lazhintseva]
Alias:
+++

Try looking in src/org/apache/xerces/readers/MIME2Java.java; the
encodings that Xerces supports are listed there.

On the machine where the data was typed in, try the following test:
+++
String encoding = System.getProperty("file.encoding", "8859_1");
+++
Of course, 8859_1 is not the expected answer!
Then look up the value it returns in MIME2Java.java.
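
As a complete program, that test could look like the sketch below. The
class name DefaultEncoding and the helper defaultEncoding are
hypothetical; only the System.getProperty call comes from the test above.

```java
public class DefaultEncoding {
    /** Returns the JVM's file.encoding, or the fallback if it is unset. */
    static String defaultEncoding(String fallback) {
        return System.getProperty("file.encoding", fallback);
    }

    public static void main(String[] args) {
        // "8859_1" is only the fallback used when the property is missing.
        String encoding = defaultEncoding("8859_1");
        System.out.println("file.encoding = " + encoding);
    }
}
```

Note that on JDK 18 and later file.encoding defaults to UTF-8 regardless
of the platform, so there the value no longer tells you the machine's
code page.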

Cheers

Jean-frederic