Posted to j-dev@xerces.apache.org by Glenn Marcy <gm...@us.ibm.com> on 2001/08/02 00:07:53 UTC
Re: UTF-8 parsing faster than US-ASCII
Xerces has hard-wired encoding support for UTF-8. US-ASCII, ISO-8859-1,
ISO-Latin-1, etc. are handed off to the JDK, so the results may differ
across environments and platforms.
-Glenn
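To illustrate the distinction Glenn describes, here is a minimal sketch of what "handed off to the JDK" looks like from the caller's side: the bytes are decoded by whatever converter the running JDK supplies for the named charset. It uses the modern java.nio.charset API rather than anything available in 2001, and the class and method names (JdkDecodeSketch, decode) are hypothetical. It does not reproduce Xerces internals.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only: decodes bytes through the JDK's own
// charset converters, the generic path a parser can fall back on when it
// has no built-in decoder for an encoding.
public class JdkDecodeSketch {

    // Reads all bytes through an InputStreamReader for the given charset.
    static String decode(byte[] bytes, Charset charset) {
        try (Reader reader =
                 new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Pure ASCII content is byte-for-byte identical in all three encodings.
        byte[] ascii = "<doc>hello</doc>".getBytes(StandardCharsets.US_ASCII);
        System.out.println(decode(ascii, StandardCharsets.US_ASCII));
        System.out.println(decode(ascii, StandardCharsets.ISO_8859_1));
        System.out.println(decode(ascii, StandardCharsets.UTF_8));
    }
}
```

For pure ASCII input all three calls return the same string; what can vary from one JDK or platform to another is how fast each converter runs.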
"Sandeep Randhawa" <sand@glide.net.in>
08/01/2001 12:07 PM
Please respond to xerces-j-dev

To: <xe...@xml.apache.org>
cc:
Subject: UTF-8 parsing faster than US-ASCII
Hi,
Somebody noticed this on NetBeans. I ran a few tests of my own and
found similar results. Is this a known issue? It seems contrary to the docs.
Sandeep Randhawa
Sandeep Randhawa wrote:
>
> <?xml version="1.0" encoding="UTF-8"?>
>
> If there is no specific reason to use "utf-8", stick with "us-ascii".
> Parsing is faster. Also, I noticed all of Netbeans Settings are stored
> without encoding attribute in the prolog. Xerces defaults to "utf-8" if no
> encoding attribute is present. So for Petr Nejedly, add the attribute in the
> prolog, we might catch a few more milliseconds.
I tried it, but with the opposite result.
I made a simple test that created a filesystem over all the modules'
layers (this is part of the IDE startup sequence) and measured the time.
Then I replaced every encoding="UTF-8" with us-ascii, adding the attribute
where it was missing, and parsing was then slower, though not by much.
So I guess we can stick with utf-8.
--
Petr Nejedly
NetBeans/Sun Microsystems
http://www.netbeans.org
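A comparison like the one described above can be sketched as follows. This is a rough timing harness, not Petr's actual test: it parses a synthetic in-memory document rather than the NetBeans module layers, and it uses whichever SAX parser JAXP returns on the running JDK, which is not necessarily the Xerces release discussed in this thread. The class name, method names, and document shape are all hypothetical, and the numbers will vary with the JVM and its warm-up state.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical timing sketch: same content, different declared encoding.
public class EncodingTimingSketch {

    // Builds an ASCII-only document whose prolog declares the given encoding.
    static byte[] buildDocument(String encodingName, Charset charset, int elements) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"")
          .append(encodingName).append("\"?>\n<root>");
        for (int i = 0; i < elements; i++) {
            sb.append("<item id=\"").append(i).append("\">value</item>");
        }
        sb.append("</root>");
        return sb.toString().getBytes(charset);
    }

    // Parses the document the given number of times; returns elapsed nanoseconds.
    static long timeParse(byte[] doc, int rounds) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            DefaultHandler handler = new DefaultHandler(); // ignores all events
            long start = System.nanoTime();
            for (int i = 0; i < rounds; i++) {
                parser.parse(new ByteArrayInputStream(doc), handler);
            }
            return System.nanoTime() - start;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        byte[] utf8 = buildDocument("UTF-8", StandardCharsets.UTF_8, 20000);
        byte[] ascii = buildDocument("US-ASCII", StandardCharsets.US_ASCII, 20000);

        // Warm-up passes so JIT compilation does not dominate the measurement.
        timeParse(utf8, 20);
        timeParse(ascii, 20);

        System.out.println("UTF-8    ns: " + timeParse(utf8, 50));
        System.out.println("US-ASCII ns: " + timeParse(ascii, 50));
    }
}
```

Because the content is ASCII-only, both byte arrays differ only in the declared encoding name, so any timing gap reflects the decoding path the parser chooses rather than the data.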