Posted to dev@tomcat.apache.org by yo...@netpotlet.com on 2002/02/08 04:26:51 UTC

Re: pageEncoding and Jasper

From: "Craig R. McClanahan" <cr...@apache.org>
Subject: RE: pageEncoding and Jasper
Date: Thu, 31 Jan 2002 06:49:39 -0800 (PST)
Message-ID: <20...@icarus.apache.org>

> On Thu, 31 Jan 2002, Kevin Jones wrote:
> 
> > Date: Thu, 31 Jan 2002 10:37:10 -0000
> > From: Kevin Jones <ke...@develop.com>
> > Reply-To: Tomcat Developers List <to...@jakarta.apache.org>
> > To: 'Tomcat Developers List' <to...@jakarta.apache.org>
> > Subject: RE: pageEncoding and Jasper
> >
> > So it's only used when compiling the JSP to a servlet?
> 
> If "it" is the pageEncoding attribute of the <%@ page %> directive, then
> the answer is yes.

Craig, the answer is "no".
The pageEncoding attribute is used only to "read" the page. The servlet
code generated from a JSP page is written in UTF-8. You can confirm
the encoding of the generated servlet code with a browser (Netscape:
View -> Character Coding) or a UTF-8-enabled editor.
So the encoding used for compilation is always UTF-8.
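
To illustrate why the compiler has to be told which encoding the generated
source uses, here is a minimal sketch using only the JDK. It is not Jasper
code; the class and method names are made up for this example. It encodes a
string as UTF-8, the way the generated .java file is written, and then
decodes the bytes once as UTF-8 and once as ISO-8859-1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Tomcat or Jasper.
class SourceEncodingDemo {

    // Encode text as UTF-8 (as the generated .java file is written),
    // then decode the bytes using the charset the reader assumes.
    static String writeUtf8ThenRead(String text, Charset readerCharset) {
        byte[] onDisk = text.getBytes(StandardCharsets.UTF_8);
        return new String(onDisk, readerCharset);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // last letter is e with an acute accent
        String ok = writeUtf8ThenRead(original, StandardCharsets.UTF_8);
        String wrong = writeUtf8ThenRead(original, StandardCharsets.ISO_8859_1);
        System.out.println(ok.equals(original));    // matching charset round-trips
        System.out.println(wrong.equals(original)); // mismatched charset does not
        System.out.println(wrong.length());         // the accented letter became two chars
    }
}
```

With javac, the matching setting is the -encoding option (for example
"javac -encoding UTF-8").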

---
Yoko Kamei Harada, Web Studio Ne-Po-Le
http://www.netpotlet.com/

--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


Re: pageEncoding and Jasper

Posted by jean-frederic clere <jf...@fujitsu-siemens.com>.
yoko@netpotlet.com wrote:
> 
> From: "Craig R. McClanahan" <cr...@apache.org>
> Subject: Re: pageEncoding and Jasper
> Date: Thu, 7 Feb 2002 20:02:00 -0800 (PST)
> Message-ID: <20...@icarus.apache.org>
> 
> > > > If "it" is the pageEncoding attribute of the <%@ page %> directive, then
> > > > the answer is yes.
> > >
> > > Craig, the answer is "no".
> >
> > Well, that's not the answer given by the JSP Specification ;-).
> 
> On this point, I agree with you, so I should say "partially no".
> There are two phases from .jsp to .class: the first is from
> .jsp to .java, and the second is from .java to .class. The pageEncoding
> attribute applies to the first phase only, as the JSP Specification
> states; for the second phase, the container always uses UTF-8.
> 
> > In the JSP 1.2 spec, see section 2.10.1, Table JSP.2-1, bottom of page 52,
> > where the "pageEncoding" attribute is defined:
> >
> >   Defines the character encoding of the JSP page.
> 
> I've read this, and I know the role of the pageEncoding attribute.
> Although it is convenient for the .java file to be written in UTF-8,
> because the file is then completely readable,

On an EBCDIC platform it is not very easy to read UTF-8 or ASCII ;-))

> I believe that Unicode escapes are
> more natural in Java. Besides, the container doesn't need any encoding
> information for the .java to .class phase if non-ASCII characters are
> converted to Unicode escapes. Why doesn't the JSP Specification mention
> that UTF-8 is used internally? Is this a Tomcat-specific technique?
> 
> > If you want to tell the container what character encoding to send on a
> > response, use the "contentType" attribute of the <%@ page %> directive:
> >
> >   <%@ page contentType="text/html;charset=UTF-8" %>
> 
> I appreciate this feature. Proper encoding management means that the
> encoding used in every phase, including the input and output streams,
> should be controllable.
> With this, JSP pages have become platform-independent documents.

If JSPs are like source files, they could be in the machine's default encoding.

> 
> > I had to do this in my Struts demo at JavaOne Japan, for example, in
> > order to display the Japanese characters correctly.  Setting pageEncoding
> > would not have done this.
> 
> If pageEncoding is not set, the container applies the encoding specified in
> the contentType attribute, doesn't it? And if contentType is not set or
> has no charset part, the container applies ISO-8859-1, whereas I think UTF-8
> should be applied. UTF-8 is the most suitable default encoding
> because it is language-independent.

The spec says ISO-8859-1.

> 
> ---
> Yoko Kamei Harada, Web Studio Ne-Po-Le
> http://www.netpotlet.com/


Re: pageEncoding and Jasper

Posted by yo...@netpotlet.com.
From: "Craig R. McClanahan" <cr...@apache.org>
Subject: Re: pageEncoding and Jasper
Date: Thu, 7 Feb 2002 20:02:00 -0800 (PST)
Message-ID: <20...@icarus.apache.org>

> > > If "it" is the pageEncoding attribute of the <%@ page %> directive, then
> > > the answer is yes.
> >
> > Craig, the answer is "no".
> 
> Well, that's not the answer given by the JSP Specification ;-).

On this point, I agree with you, so I should say "partially no".
There are two phases from .jsp to .class: the first is from
.jsp to .java, and the second is from .java to .class. The pageEncoding
attribute applies to the first phase only, as the JSP Specification
states; for the second phase, the container always uses UTF-8.
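
The first phase described above amounts to decoding the page bytes with the
pageEncoding charset and writing the result out as UTF-8. A rough sketch of
that transcoding step, using only the JDK, might look like this. The class
and method are hypothetical and far simpler than what Jasper actually does.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of the .jsp -> .java transcoding step.
class PageTranscoder {

    // Decode the raw page bytes using the declared pageEncoding,
    // then re-encode the text as UTF-8 for the generated source file.
    static byte[] toUtf8(byte[] pageBytes, String pageEncoding) {
        String text = new String(pageBytes, Charset.forName(pageEncoding));
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // In ISO-8859-1, e with an acute accent is the single byte 0xE9.
        byte[] latin1Page = { 'c', 'a', 'f', (byte) 0xE9 };
        byte[] utf8Source = toUtf8(latin1Page, "ISO-8859-1");
        System.out.println(latin1Page.length + " bytes in, "
                + utf8Source.length + " bytes out");
    }
}
```

The byte counts differ because the accented letter takes one byte in
ISO-8859-1 and two in UTF-8; the second phase then only has to deal with
UTF-8, whatever the page was written in.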

> In the JSP 1.2 spec, see section 2.10.1, Table JSP.2-1, bottom of page 52,
> where the "pageEncoding" attribute is defined:
> 
>   Defines the character encoding of the JSP page.

I've read this, and I know the role of the pageEncoding attribute.
Although it is convenient for the .java file to be written in UTF-8,
because the file is then completely readable, I believe that Unicode
escapes are more natural in Java. Besides, the container doesn't need any
encoding information for the .java to .class phase if non-ASCII characters
are converted to Unicode escapes. Why doesn't the JSP Specification mention
that UTF-8 is used internally? Is this a Tomcat-specific technique?
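
As a sketch of the Unicode-escape alternative mentioned above, the following
hypothetical helper (not taken from Jasper) rewrites every non-ASCII
character as a \uXXXX escape, so the resulting text is pure ASCII and can be
compiled without any encoding information.

```java
// Hypothetical helper showing the Unicode-escape alternative.
class UnicodeEscaper {

    // Replace every char outside 7-bit ASCII with a backslash-u escape
    // (four hex digits), leaving ASCII characters untouched.
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x80) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Input ends with e-acute; the output is plain ASCII.
        System.out.println(escapeNonAscii("caf\u00e9"));
    }
}
```

The loop works on UTF-16 code units, so a supplementary character comes out
as two escapes, which is the form Java source expects. The trade-off is the
one noted above: the escaped file is no longer readable by eye.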

> If you want to tell the container what character encoding to send on a
> response, use the "contentType" attribute of the <%@ page %> directive:
> 
>   <%@ page contentType="text/html;charset=UTF-8" %>

I appreciate this feature. Proper encoding management means that the
encoding used in every phase, including the input and output streams,
should be controllable.
With this, JSP pages have become platform-independent documents.

> I had to do this in my Struts demo at JavaOne Japan, for example, in
> order to display the Japanese characters correctly.  Setting pageEncoding
> would not have done this.

If pageEncoding is not set, the container applies the encoding specified in
the contentType attribute, doesn't it? And if contentType is not set or
has no charset part, the container applies ISO-8859-1, whereas I think UTF-8
should be applied. UTF-8 is the most suitable default encoding
because it is language-independent.
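
The fallback described in this paragraph (use the charset from contentType,
otherwise ISO-8859-1) could be sketched as follows. This is a hypothetical
helper written for illustration; the real container logic is more involved
and is not reproduced here.

```java
import java.util.Locale;

// Hypothetical helper; names are assumptions, not a Tomcat API.
class ContentTypeCharset {

    static final String DEFAULT_CHARSET = "ISO-8859-1";

    // Return the value of the charset parameter in a contentType string,
    // or ISO-8859-1 when contentType is null or has no charset part.
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return DEFAULT_CHARSET;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = p.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? DEFAULT_CHARSET : value;
            }
        }
        return DEFAULT_CHARSET;
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/html;charset=UTF-8"));
        System.out.println(charsetOf("text/html"));
        System.out.println(charsetOf(null));
    }
}
```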

---
Yoko Kamei Harada, Web Studio Ne-Po-Le
http://www.netpotlet.com/



Re: pageEncoding and Jasper

Posted by Bill Barker <wb...@wilshire.com>.
----- Original Message -----
From: <yo...@netpotlet.com>
To: <to...@jakarta.apache.org>
Sent: Thursday, February 07, 2002 7:26 PM
Subject: Re: pageEncoding and Jasper


> From: "Craig R. McClanahan" <cr...@apache.org>
> Subject: RE: pageEncoding and Jasper
> Date: Thu, 31 Jan 2002 06:49:39 -0800 (PST)
> Message-ID: <20...@icarus.apache.org>
>
> > On Thu, 31 Jan 2002, Kevin Jones wrote:
> >
> > > Date: Thu, 31 Jan 2002 10:37:10 -0000
> > > From: Kevin Jones <ke...@develop.com>
> > > Reply-To: Tomcat Developers List <to...@jakarta.apache.org>
> > > To: 'Tomcat Developers List' <to...@jakarta.apache.org>
> > > Subject: RE: pageEncoding and Jasper
> > >
> > > So it's only used when compiling the JSP to a servlet?
> >
> > If "it" is the pageEncoding attribute of the <%@ page %> directive, then
> > the answer is yes.
>
> Craig, the answer is "no".
I'd like to think that Craig knows what Tomcat 4.x does ;-)

> The pageEncoding attribute is used to "read" only. Servlet codes
Which is what he said, if you followed the rest of the thread.

> generated from JSP pages are written in UTF-8. You can confirm
> the encoding of generated Servlet codes by a browser (Netscape:
> View -> Character Coding) or a utf-8 enabled editor.
This is controlled by the contentType attribute of the <%@ page ... %>
directive.

> So, the encoding for compile is UTF-8 only.
>
It is true that Jasper (in both the 3.x and 4.x branches) generates, by default,
the intermediate .java file as UTF-8, which causes big problems when you
configure the system to use Jikes :(
> ---
> Yoko Kamei Harada, Web Studio Ne-Po-Le
> http://www.netpotlet.com/


Re: pageEncoding and Jasper

Posted by "Craig R. McClanahan" <cr...@apache.org>.

On Fri, 8 Feb 2002 yoko@netpotlet.com wrote:

> Date: Fri, 08 Feb 2002 12:26:51 +0900 (JST)
> From: yoko@netpotlet.com
> Reply-To: Tomcat Developers List <to...@jakarta.apache.org>
> To: tomcat-dev@jakarta.apache.org
> Subject: Re: pageEncoding and Jasper
>
> From: "Craig R. McClanahan" <cr...@apache.org>
> Subject: RE: pageEncoding and Jasper
> Date: Thu, 31 Jan 2002 06:49:39 -0800 (PST)
> Message-ID: <20...@icarus.apache.org>
>
> > On Thu, 31 Jan 2002, Kevin Jones wrote:
> >
> > > Date: Thu, 31 Jan 2002 10:37:10 -0000
> > > From: Kevin Jones <ke...@develop.com>
> > > Reply-To: Tomcat Developers List <to...@jakarta.apache.org>
> > > To: 'Tomcat Developers List' <to...@jakarta.apache.org>
> > > Subject: RE: pageEncoding and Jasper
> > >
> > > So it's only used when compiling the JSP to a servlet?
> >
> > If "it" is the pageEncoding attribute of the <%@ page %> directive, then
> > the answer is yes.
>
> Craig, the answer is "no".

Well, that's not the answer given by the JSP Specification ;-).

In the JSP 1.2 spec, see section 2.10.1, Table JSP.2-1, bottom of page 52,
where the "pageEncoding" attribute is defined:

  Defines the character encoding of the JSP page.

There is further text in Chapter 3 making it clear that this affects the
compiler only.  It has zero impact on what is sent to a browser as part of
an HTTP response.

> The pageEncoding attribute is used to "read" only.

Technically this is true, but remember that the source of the JSP page is
read only once, by the JSP compiler, so the answer to "it's only used when
compiling the JSP to a servlet" is correct.

> Servlet codes
> generated from JSP pages are written in UTF-8. You can confirm
> the encoding of generated Servlet codes by a browser (Netscape:
> View -> Character Coding) or a utf-8 enabled editor.
> So, the encoding for compile is UTF-8 only.
>

If you want to tell the container what character encoding to send on a
response, use the "contentType" attribute of the <%@ page %> directive:

  <%@ page contentType="text/html;charset=UTF-8" %>

I had to do this in my Struts demo at JavaOne Japan, for example, in
order to display the Japanese characters correctly.  Setting pageEncoding
would not have done this.
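
To see why the response charset matters for Japanese text, here is a small
JDK-only sketch. It is not servlet code and the class name is made up: it
just encodes a string with a given charset and decodes it again, the way a
browser honouring that charset would.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo of what the response charset does to the output bytes.
class ResponseCharsetDemo {

    // Encode the text with the response charset and decode it again the
    // way a browser honouring that charset would.
    static String roundTrip(String text, Charset responseCharset) {
        byte[] wire = text.getBytes(responseCharset);
        return new String(wire, responseCharset);
    }

    public static void main(String[] args) {
        // Three kanji ("Nihongo"), written as Unicode escapes.
        String japanese = "\u65e5\u672c\u8a9e";
        // UTF-8 can represent the characters, so they survive.
        System.out.println(roundTrip(japanese, StandardCharsets.UTF_8).equals(japanese));
        // ISO-8859-1 cannot, so each one is replaced with '?'.
        System.out.println(roundTrip(japanese, StandardCharsets.ISO_8859_1));
    }
}
```

This is the effect of leaving the response at the ISO-8859-1 default: the
characters are lost at encoding time, regardless of how the page source was
read.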

> ---
> Yoko Kamei Harada, Web Studio Ne-Po-Le
> http://www.netpotlet.com/
>

Craig

