Posted to users@tomcat.apache.org by Michael Schuerig <mi...@schuerig.de> on 2004/09/08 16:07:26 UTC
JDT-Compiler character encoding
I've tried the following four combinations of settings, where
jspx denotes the encoding declared and used in my jspx document,
jsp-javaEncoding is declared in conf/web.xml, and jasper-out is the
relevant line in the generated xxx_jspx.java.
(1)
jspx: ISO-8859-1
jsp-javaEncoding: not explicitly set
jasper-out:
out.write("\tÀöÌÃ<84>Ã<96>Ã<9C>Ã<9F>\n");
(2)
jspx: UTF-8
jsp-javaEncoding: not explicitly set
jasper-out:
out.write("\tÀöÌÃ<84>Ã<96>Ã<9C>Ã<9F>\n");
(3)
jspx: ISO-8859-1
jsp-javaEncoding: ISO-8859-1
jasper-out:
out.write("\täöüÄÖÜß\n");
(4)
jspx: UTF-8
jsp-javaEncoding: ISO-8859-1
jasper-out:
out.write("\täöüÄÖÜß\n");
Only (3) and (4) appear correctly in the browser as "äöüÄÖÜß" (German
umlauts). I don't think setting the javaEncoding should be necessary
here, but I may well be misunderstanding something.
Without any javaEncoding given, jasper produces UTF-8 encoded java
source code and the JDT compiler supposedly accepts UTF-8 as its
default input encoding. I haven't verified the latter.
There seem to be two possible causes for the incorrect output: either
the JDT compiler doesn't behave as advertised, i.e., it does not take
UTF-8 as its default input encoding, *or* the JDT compiler produces
character output in UTF-8 which is later erroneously treated as
ISO-8859-1.
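Both hypotheses come down to the same kind of mismatch: bytes written in
one encoding are read back in another. A minimal sketch of that effect
follows. The class and method names are made up for illustration and are
not taken from Jasper or the JDT. The umlauts are written as Unicode
escapes so the example itself does not depend on the source file encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingMismatch {
    // Hypothetical helper: encode text with one charset, then decode
    // the resulting bytes with a different one.
    static String reinterpret(String text, Charset writtenAs, Charset readAs) {
        return new String(text.getBytes(writtenAs), readAs);
    }

    public static void main(String[] args) {
        // Lower- and upper-case a/o/u umlauts plus sharp s, as Unicode escapes.
        String umlauts = "\u00E4\u00F6\u00FC\u00C4\u00D6\u00DC\u00DF";
        String garbled = reinterpret(umlauts,
                StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        // Each umlaut is two bytes in UTF-8, and ISO-8859-1 maps every
        // byte to exactly one character, so the string doubles in length.
        System.out.println(umlauts.length() + " chars -> "
                + garbled.length() + " chars");
        // prints: 7 chars -> 14 chars
    }
}
```

The doubling in length matches the jasper-out lines in (1) and (2), where
each umlaut shows up as a two-character sequence.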
Michael
--
Michael Schuerig Contests between male toads over females are
mailto:michael@schuerig.de often settled by the depth of the croak.
http://www.schuerig.de/michael/ --John Maynard Smith
---------------------------------------------------------------------
To unsubscribe, e-mail: tomcat-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: tomcat-user-help@jakarta.apache.org
Re: JDT-Compiler character encoding
Posted by Michael Schuerig <mi...@schuerig.de>.
On Wednesday 08 September 2004 16:07, Michael Schuerig wrote:
> There seem to be two possible causes for the incorrect output
>
> the JDT compiler doesn't behave as advertised, i.e., it does not take
> UTF-8 as default input encoding. *Or* the JDT compiler produces
> character output in UTF-8 which is latter erroneously treated as
> ISO-8859-1.
Precompiled with Ant javac, encoding="UTF-8":
java:
out.write("\n\n TEST\n
\n\tÀöÌÃ<84>Ã<96>Ã<9C>Ã<9F>\n\t\n\t");
decompiled class:
out.write("\n\n TEST\n
\n\t\344\366\374\304\326\334\337\n\t\n\t");
Server compiled (without javaEncoding set in web.xml):
java:
out.write("\tÀöÌÃ<84>Ã<96>Ã<9C>Ã<9F>\n");
decompiled class:
out.write("\t\303\u20AC\303\266\303\u0152\303\204\303\226\303\234\303\237\n");
Server compiled (with javaEncoding ISO-8859-1 set in web.xml):
java:
out.write("\täöüÄÖÜß\n");
decompiled class:
out.write("\t\344\366\374\304\326\334\337\n");
Something's amiss here. Apparently, by default the JDT compiler does not
decode UTF-8 input correctly; rather, it seems to expect ISO-8859-1.
Now, is this a bug or am I misunderstanding something?
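As a cross-check, here is a rough sketch that tries to reproduce the
decompiled literals above from the UTF-8 bytes of the umlauts. It rests
on an assumption: the \u20AC and \u0152 in the server-compiled class
suggest the bytes were decoded as ISO-8859-15 (where 0xA4 is the euro
sign and 0xBC is the OE ligature) rather than strictly ISO-8859-1. The
platform default encoding of the server is not known here, so this is
only a guess. The class and method names are hypothetical, and the escape
format merely imitates the decompiler output.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class LiteralCheck {
    // Decode the UTF-8 bytes of the text with the given charset,
    // imitating a compiler that reads a UTF-8 source file with it.
    static String decodeUtf8BytesAs(String text, Charset assumed) {
        return new String(text.getBytes(StandardCharsets.UTF_8), assumed);
    }

    // Render non-ASCII characters the way the decompiler output above
    // does: octal escapes below 0x100, four-digit hex escapes otherwise.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c < 0x80) {
                sb.append(c);
            } else if (c < 0x100) {
                sb.append('\\').append(Integer.toOctalString(c));
            } else {
                sb.append(String.format("\\u%04X", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String umlauts = "\u00E4\u00F6\u00FC\u00C4\u00D6\u00DC\u00DF";
        // Assumption: ISO-8859-15 is available in this JVM.
        Charset latin9 = Charset.forName("ISO-8859-15");
        // Source read as UTF-8 (the precompiled javac case).
        System.out.println(escape(decodeUtf8BytesAs(umlauts, StandardCharsets.UTF_8)));
        // Source read as ISO-8859-15 (the assumed server-compiled case).
        System.out.println(escape(decodeUtf8BytesAs(umlauts, latin9)));
    }
}
```

If the assumption holds, the first line should match the literal from the
precompiled class and the second the one from the server-compiled class
without javaEncoding. That would point at the compiler reading the
generated source in the platform default encoding instead of UTF-8.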
Michael
--
Michael Schuerig Nothing is as brilliantly adaptive
mailto:michael@schuerig.de as selective stupidity.
http://www.schuerig.de/michael/ --A.O. Rorty, The Deceptive Self