Posted to users@cocoon.apache.org by Mark Lundquist <ml...@wrinkledog.com> on 2004/08/22 00:20:21 UTC
container-encoding vs. form-encoding... bug?
Hi,
I'm using Cocoon 2.1.5.1 w/ Jetty 4.2.15. xalan was throwing a
SAXException trying to write a character (U+2026, &hellip;) that's not
representable "in the specified output encoding iso-8859-1".
I made sure I had <xml:output encoding="UTF-8"> everywhere, but the
problem persisted. Finally I figured out that I needed to check the
encoding parameters in web.xml. Sure enough, container-encoding and
form-encoding were not set, and the comments indicate that they default
to iso-8859-1.
So I set the container-encoding to UTF-8, and that didn't have any
effect. Only when I set form-encoding to UTF-8 did my problem go away.
The thing is, the character that was causing the problem isn't coming
from the request! I expected container-encoding to be the one that
would affect the behavior I was seeing.
So, am I just not understanding something correctly? Or is it a bug,
and if so is it a problem with Cocoon or with Jetty?
Cheers,
Mark
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@cocoon.apache.org
For additional commands, e-mail: users-help@cocoon.apache.org
Re: container-encoding vs. form-encoding... bug?
Posted by Marc Portier <mp...@outerthought.org>.
this wiki article should explain everything:
http://wiki.apache.org/cocoon/RequestParameterEncoding
Mark Lundquist wrote:
> Hi,
>
> I'm using Cocoon 2.1.5.1 w/ Jetty 4.2.15. xalan was throwing a
> SAXException trying to write a character (U+2026, &hellip;) that's not
> representable "in the specified output encoding iso-8859-1".
>
probably somewhere in the serializer
> I made sure I had <xml:output encoding="UTF-8"> everywhere, but the
to no avail (and I assume you wanted to type <xsl:output ...> )
this directive is used by xalan only when the 'xalan engine' is operating
in a mode where it needs to transform AND serialize
cocoon however (having its reasons to separate the two operations) will
override this line in the xsl anyway... for cocoon the end result of a
transformer needs to be pure sax-events that will be piped through a
serializer later on
Since cocoon overrides that anyway, the <xsl:output ...> in your
stylesheets is only useful to ease your debugging work: when you run a
stylesheet on its own you can see its output in your favourite encoding
(and with whatever other output params you like).
(for API geeks, see:
http://java.sun.com/j2se/1.4.2/docs/api/javax/xml/transform/Transformer.html#setOutputProperties(java.util.Properties) )
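For illustration, a stylesheet carrying the directive discussed above might look like the sketch below. The identity template is only a placeholder and does not come from the original post. Per the explanation above, the encoding attribute matters when the XSLT engine serializes its own output; inside a cocoon pipeline the serializer decides instead.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Honoured when the XSLT engine both transforms and serializes
       (e.g. a standalone debugging run); overridden inside cocoon. -->
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <!-- Placeholder identity template (an assumption for this sketch). -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```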
> problem persisted. Finally I figured out that I needed to check the
> encoding parameters in web.xml. Sure enough, container-encoding and
> form-encoding were not set, and the comments indicate that they default
> to iso-8859-1.
>
> So I set the container-encoding to UTF-8, and that didn't have any
> effect. Only when I set form-encoding to UTF-8 did my problem go away.
container-encoding should be set to the encoding your chosen container
(jetty) is using to decode (the body of) HTTP-requests
most containers use iso-8859-1 here, so you should just leave it unless
you know your container does it differently
a recent post showed that Jetty will allow you to set it yourself by
specifying a system property -Dorg.mortbay.util.URI.charset=UTF-8
(see: http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=109273705513761&w=2 )
so only when playing with this should you get into changing the
container-encoding in the web.xml
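As a sketch of what "leaving it alone" looks like when spelled out, the fragment below sets container-encoding as an init-param of the Cocoon servlet. The surrounding <servlet> element is omitted, and the placement is an assumption based on the stock Cocoon 2.1 web.xml rather than something stated in this thread.

```xml
<!-- Inside the <servlet> declaration for Cocoon in WEB-INF/web.xml
     (placement assumed from the stock Cocoon 2.1 web.xml). -->
<init-param>
  <param-name>container-encoding</param-name>
  <!-- Encoding the container (here Jetty) uses to decode requests;
       keep ISO-8859-1 unless the container is configured otherwise. -->
  <param-value>ISO-8859-1</param-value>
</init-param>
```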
> The thing is, the character that was causing the problem isn't coming
> from the request! I expected container-encoding to be the one that
> would affect the behavior I was seeing.
>
as you have found out by now, the container-encoding setting only comes
into play when the HTTP-request's body is read in some way
> So, am I just not understanding something correctly? Or is it a bug,
> and if so is it a problem with Cocoon or with Jetty?
>
what really needs to happen in this story is telling the SERIALIZER in
cocoon what encoding to use
it's quite logical: the <xsl:output ...> directive is overridden in the
transformer part, so we need to inject that info back again; since this
is about the serialization part of things, you should give that info to
the serializer. So how do you do that?
1/ You do that on a local level (one serializer) by applying the hints
Jan just gave in his post. (setting map:serializer/@mime-type and
./encoding)
2/ You do that on a global level (default for all text-serializers) by
doing what you did: setting the form-encoding in web.xml.
Historically that setting also comes into play in the area of
request-parameters. However there is a 'bug' (well, maybe rather a
'historic way of interpreting' the specs) in most browsers that makes
them encode their requests with the same encoding as the one applied to
the page holding the form that asks for the request-parameters. Because
of this client-side coupling we opted to make the applied form-encoding
also be the default for our serializers.
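The global option in 2/ could be written roughly as follows. This is a sketch only: the init-param sits inside the Cocoon <servlet> declaration in web.xml, which is omitted here, and the placement is assumed from the stock Cocoon 2.1 web.xml.

```xml
<!-- Inside the <servlet> declaration for Cocoon in WEB-INF/web.xml
     (placement assumed from the stock Cocoon 2.1 web.xml). -->
<init-param>
  <param-name>form-encoding</param-name>
  <!-- Used for decoding request-parameters and, as described above,
       as the default encoding for the text serializers. -->
  <param-value>UTF-8</param-value>
</init-param>
```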
HTH,
-marc=
--
Marc Portier http://outerthought.org/
Outerthought - Open Source, Java & XML Competence Support Center
Read my weblog at http://blogs.cocoondev.org/mpo/
mpo@outerthought.org mpo@apache.org
Re: container-encoding vs. form-encoding... bug?
Posted by Jan Hoskens <jh...@schaubroeck.be>.
You should set the container-encoding to ISO-8859-1 and leave the
form-encoding as UTF-8. If I remember correctly, the container-encoding
was introduced with servlet API 2.3, while cocoon was still coping with
2.2. The latter passed everything in ISO-8859-1, and cocoon expects it
to be ISO-8859-1 (this will probably change in later cocoon versions).
Remember to set your encoding in your serializers too (two places to
look for: the element "encoding" AND the attribute "mime-type"):
<map:serializer logger="sitemap.serializer.xhtml"
                mime-type="text/html; charset=UTF-8"
                name="xhtml"
                pool-grow="2" pool-max="64" pool-min="10"
                src="org.apache.cocoon.serialization.XMLSerializer">
  <doctype-public>-//W3C//DTD XHTML 1.0 Strict//EN</doctype-public>
  <doctype-system>http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd</doctype-system>
  <encoding>utf-8</encoding>
</map:serializer>
Kind regards,
Jan