Posted to users@tomcat.apache.org by "Craig R. McClanahan" <cr...@apache.org> on 2001/12/03 09:02:38 UTC

Re: Jsp compile option for Big5 encoding / encoding question


On Mon, 3 Dec 2001, Jim Cheesman wrote:

> Date: Mon, 03 Dec 2001 09:11:56 +0100
> From: Jim Cheesman <jc...@msl.es>
> Reply-To: Tomcat Users List <to...@jakarta.apache.org>
> To: Tomcat Users List <to...@jakarta.apache.org>
> Subject: Re: Jsp compile option for Big5 encoding / encoding question
>
> At 01:53 AM 03/12/01, you wrote:
> >This default is required by the Servlet and JSP specifications, so it is
> >not configurable.
>
>
> So why did they decide against unicode??? Shouldn't that be the standard?
>

Does your browser support Unicode directly?  Do you really want to pay
16-bit output (or potentially more, for future versions of Unicode)
overhead for every single character?

>
> Quick question (vaguely off topic?) re. encoding:
>
> The situation:
> We have a database (at the moment either SQL Server 2000 or DB2) with
> non-ascii data included (Spanish and French characters, mainly). We have to
> serve XML pages from this, and are using jsp and tomcat 4.0 to do so. The
> pages generate correctly, but are not visible using M$ IE 5.x/6.0 if the
> encoding is set to UTF-8. If we set the encoding to ISO8859 it all works fine.
>
> Why? Is this a problem with IE5? Or what?
>

Pages must be actually rendered in the character encoding you say that
they are encoded with.  For your purposes, that means that you must either
encode the characters (in your database) in the same character set that
you set on output, or you must perform the appropriate conversions in your
application.
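
To illustrate the point above, here is a minimal Java sketch (not taken from
Tomcat; the class and method names are hypothetical) that encodes a string in
one character set and decodes the bytes in another. It assumes the data holds
a Spanish word with one non-ASCII character.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingMismatch {
    /** Encode text with the charset actually used, then decode with the declared one. */
    public static String roundTrip(String text, Charset actual, Charset declared) {
        byte[] bytes = text.getBytes(actual);
        return new String(bytes, declared);
    }

    public static void main(String[] args) {
        // "Espana" with an n-tilde, written as a Unicode escape so the
        // source file's own encoding does not matter.
        String text = "Espa\u00f1a";

        // The same six characters occupy a different number of bytes.
        System.out.println(text.getBytes(StandardCharsets.ISO_8859_1).length); // 6
        System.out.println(text.getBytes(StandardCharsets.UTF_8).length);      // 7

        // Declared and actual encodings agree: the text survives.
        System.out.println(
            roundTrip(text, StandardCharsets.UTF_8, StandardCharsets.UTF_8).equals(text)); // true

        // ISO-8859-1 bytes labelled as UTF-8: the n-tilde byte is not valid UTF-8.
        System.out.println(
            roundTrip(text, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8).equals(text)); // false
    }
}
```

The last case is the likely shape of the problem described: bytes produced in
one encoding and labelled as another, which a strict XML parser will reject.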

In addition, your browser (or OS, depending on platform) must be
configured to support the characters you are interested in.  For example,
I needed to support Japanese output characters (with a UTF-8 character
encoding).  This still didn't work on my Windows laptop until I had
downloaded and installed the Japanese font, and configured my browser
appropriately.

> (And out of interest, what encoding is used in each stage?)
>

JSP pages follow these rules:

* If you declare a "pageEncoding" attribute on your <%@ page %>
  directive (supported in JSP 1.2 only), that character set is used
  to read the text of the page itself (as the page is being compiled).

* The character encoding from the contentType attribute of the
  <%@ page %> directive is used for the response, if present.

* If neither of the above is set, ISO-8859-1 is assumed.
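
As a sketch of the JSP rules above, a page directive that sets both attributes
might look like the following. The choice of UTF-8 and text/xml is only an
assumption based on the question; use whatever encoding your data is really in.

```jsp
<%-- pageEncoding: how the container reads this file when compiling it.
     contentType charset: how the response is encoded and labelled. --%>
<%@ page pageEncoding="UTF-8" contentType="text/xml; charset=UTF-8" %>
```

If the directive is omitted entirely, the ISO-8859-1 default applies.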

For servlets, the rules are slightly simpler:

* If you specify a character encoding in the setContentType() setting,
  it is used.

* Otherwise, ISO-8859-1 is assumed.
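
The servlet rule can be sketched in plain Java as below. This is only an
illustration of the fallback logic, not Tomcat's actual code; the class and
the charsetOf() helper are hypothetical names.

```java
import java.util.Locale;

public class ContentTypeCharset {
    /** Default required by the Servlet and JSP specifications. */
    static final String DEFAULT_CHARSET = "ISO-8859-1";

    /** Return the charset named in a content type string, or the default. */
    public static String charsetOf(String contentType) {
        if (contentType == null) {
            return DEFAULT_CHARSET;
        }
        // parts[0] is the media type; the rest are parameters.
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = param.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                if (!value.isEmpty()) {
                    return value;
                }
            }
        }
        return DEFAULT_CHARSET;
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/xml; charset=UTF-8")); // UTF-8
        System.out.println(charsetOf("text/html"));               // ISO-8859-1
    }
}
```

In a real servlet the equivalent step is to call
response.setContentType("text/xml; charset=UTF-8") before calling
response.getWriter(), so that the writer uses the declared encoding.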


> Jim
>

Craig McClanahan



--
To unsubscribe:   <ma...@jakarta.apache.org>
For additional commands: <ma...@jakarta.apache.org>
Troubles with the list: <ma...@jakarta.apache.org>


Re: Jsp compile option for Big5 encoding / encoding question

Posted by Jim Cheesman <jc...@msl.es>.
At 09:02 AM 03/12/01, you wrote:

And thanks for the (mostly) snipped answer.




> > At 01:53 AM 03/12/01, you wrote:
> > >This default is required by the Servlet and JSP specifications, so it is
> > >not configurable.
> >
> >
> > So why did they decide against unicode??? Shouldn't that be the standard?
> >
>
>Does your browser support Unicode directly?  Do you really want to pay
>16-bit output (or potentially more, for future versions of Unicode)
>overhead for every single character?


Yes, I would like to pay that. I already pay a 7/8 bit overhead, and a
supposedly machine-independent language would be well served by further
internationalisation. I already accept that Java is not as fast as, say, C,
so what's the problem? Set the default to Unicode, and then allow people to
downgrade to a subset if the performance overhead is critical.

--

                           *   Jim Cheesman   *
             Trabajo: jchees@msl.es - (34)(91) 724 9200 x 2360
           Help stamp out and eradicate superfluous redundancy


