Posted to j-dev@xerces.apache.org by Glenn Marcy <gm...@us.ibm.com> on 2002/05/25 02:42:17 UTC

Re: Java encoding names

Actually, the XML spec doesn't require that a processor recognize
anything other than UTF-8 and UTF-16, so we are being quite nice
in supporting the IANA registered names.  It is also very helpful
of us to provide an option that supports the use of Java encoding
names at all.  Or at least I would think so considering that there
is nothing in the XML spec that mentions parsers are implemented
in Java.  The whole point of registering names with IANA is so that
there is agreement on what the names of the encodings happen to be.

I think that promoting *by default* the use of encoding names that
are a property of which JDK you are using is a bad situation when
it comes to document interoperability.  Hey, isn't it great that
the business data for my company is in XML...  oh, you mean that
you cannot actually parse any of my documents because I am using
a non-interoperable encoding name for all my data?
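One way to avoid writing a JDK-specific name into a document is to
resolve it to its canonical name first.  The sketch below assumes the
java.nio.charset API (JDK 1.4 and later), whose documentation requires
that the canonical name of an IANA-registered charset be the registry
name.  The class and method names are made up for illustration and are
not part of Xerces.

```java
import java.nio.charset.Charset;

public class IanaNameExample {

    // Hypothetical helper: build an XML declaration that carries the
    // canonical charset name rather than a JDK-specific alias.
    static String xmlDeclarationFor(String encodingName) {
        // Charset.forName accepts aliases such as "ISO8859_1" or "UTF8";
        // name() returns the canonical name of the resolved charset.
        String canonical = Charset.forName(encodingName).name();
        return "<?xml version=\"1.0\" encoding=\"" + canonical + "\"?>";
    }

    public static void main(String[] args) {
        System.out.println(xmlDeclarationFor("ISO8859_1"));
        System.out.println(xmlDeclarationFor("UTF8"));
    }
}
```

A charset that is not in the IANA registry would still come back under
whatever name the JDK gives it, so this only helps for registered
encodings.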

Your point about encodings being provided externally seems to be
completely orthogonal to this issue.  Saying that the following
data is written using the "foobar" encoding as the first line of
a letter or putting it on the outside of the envelope containing
that letter makes little difference if there is no agreement on
what "foobar" means.

To be fair, I have nothing against people using whatever encodings
make sense for them.  However, I would prefer not to get into the
game of creating "de facto" standards, i.e. "Hey, can you fix your
parser?  Since Xerces accepts my 'foobar' encoded documents your
parser must be broken if you don't accept them as well."  There is
a reason that encoding names are registered with IANA in the first
place.

Lastly, since you can always wrap an InputStreamReader around your
byte streams using whatever ByteToCharConverter the JDK is willing
to give you for whatever encoding names you are willing to accept,
I do not see how the current Xerces behavior is a problem when you
claim to know the encoding externally.
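A minimal sketch of that approach follows, assuming a JAXP parser is
available and that the document carries no conflicting requirements of
its own.  The class name, the helper method and the sample document are
invented for illustration; "ISO8859_1" is the Java name for ISO-8859-1.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class ExternalEncodingExample {

    // Hypothetical helper: the caller already knows the encoding, so the
    // bytes are decoded by the JDK and the parser only ever sees characters.
    static Document parseWithKnownEncoding(InputStream bytes, String javaEncodingName)
            throws Exception {
        // Any encoding name the JDK accepts works here, registered or not.
        Reader reader = new InputStreamReader(bytes, javaEncodingName);
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // An InputSource built from a Reader hands the parser a character stream.
        return builder.parse(new InputSource(reader));
    }

    public static void main(String[] args) throws Exception {
        // Sample document with one non-ASCII character, encoded as ISO-8859-1.
        byte[] data = "<greeting>caf\u00e9</greeting>".getBytes("ISO8859_1");
        Document doc =
                parseWithKnownEncoding(new ByteArrayInputStream(data), "ISO8859_1");
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

Because the decoding happens outside the parser, the parser never has
to recognize the Java encoding name at all.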

Regards,
Glenn




---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-dev-help@xml.apache.org


Re: Java encoding names

Posted by Andy Clark <an...@apache.org>.
I agree with Glenn. I think the allow-Java-encoding-names
feature should remain OFF by default.

-- 
Andy Clark * andyc@apache.org
