Posted to dev@cocoon.apache.org by Leszek Gawron <ou...@wlkp.org> on 2003/11/05 16:33:51 UTC

encoding problem

Current CVS version of cocoon:

ServerPagesGenerator.generate(): java.lang.RuntimeException:
org.xml.sax.SAXException: Attempt to output character of integral value 346
that is not represented in specified output encoding of .

I am using UTF-8 everywhere, and switching to the current Cocoon CVS version either
produces the above error or garbles all Polish characters in my HTML
pages.

	LG
-- 
            __
         | /  \ |        Leszek Gawron            //  \\
        \_\\  //_/       ouzo@wlkp.org           _\\()//_
         .'/()\'.     Phone: +48(501)720812     / //  \\ \
          \\  //  recursive: adj; see recursive  | \__/ |


Re: encoding problem

Posted by Marc Portier <mp...@outerthought.org>.

Leszek Gawron wrote:
> On Wed, Nov 05, 2003 at 04:33:51PM +0100, Leszek Gawron wrote:
> 
>>Current CVS version of cocoon:
>>
>>ServerPagesGenerator.generate(): java.lang.RuntimeException:
>>org.xml.sax.SAXException: Attempt to output character of integral value 346
>>that is not represented in specified output encoding of .
>>
>>I am using UTF-8 everywhere, and switching to the current Cocoon CVS version either
>>produces the above error or garbles all Polish characters in my HTML
>>pages.
> 

hm, I did some minor tests with funny chars in XML serialized into
ISO-8859-1 encoded files, and they were all nicely converted into &#....;
character-entities (which admittedly don't look that nice, but at
least 'work')

I did not use any Polish characters though

the only problem I would expect is when your Polish characters need to
show up in the XML names of elements and attributes: there you can't have
character-entities and thus the file-encoding must just be right...
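
A minimal sketch of the fallback described above, assuming a serializer
that writes ISO-8859-1: any character the target encoding cannot represent
is written as a &#....; numeric reference instead. The class and method
names are hypothetical and this is not Cocoon's serializer code; it only
illustrates the idea using the standard java.nio.charset API.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

public class CharRefSketch {

    // Hypothetical helper: copies encodable characters through unchanged and
    // replaces everything else with a decimal numeric character reference.
    public static String escapeUnencodable(String text, CharsetEncoder encoder) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            String single = new String(Character.toChars(codePoint));
            if (encoder.canEncode(single)) {
                out.append(single);
            } else {
                out.append("&#").append(codePoint).append(';');
            }
            i += Character.charCount(codePoint);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        CharsetEncoder latin1 = StandardCharsets.ISO_8859_1.newEncoder();
        // U+015A (decimal 346) is the character named in the reported exception.
        System.out.println(escapeUnencodable("\u015Awiat", latin1)); // prints &#346;wiat
    }
}
```

This only works for character data and attribute values; as noted above,
element and attribute names cannot be escaped this way.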


> I am sorry. This has nothing to do with Cocoon. It's Tomcat 4.1.29's
> fault. It sets the Content-Type header to ISO-8859-1 .. strange
> 	lg

Are you sure?
I'm afraid this _has_ something to do with cocoon...


pls check
- discussion:
http://marc.theaimsgroup.com/?t=106760662600010&r=1&w=2

- recent commit:
http://marc.theaimsgroup.com/?l=xml-cocoon-cvs&m=106789462214616&w=2

and give your opinion...


(I am a bit confused by the ServerPagesGenerator part in this, but I
guess it's just about having only a piece of the stack trace?)

but anyway:

here and now you can safely get back to UTF-8 on the serialized output
(and consistently also change the encoding for
request-parameter-encoding) by changing the 'form-encoding' init-param
of the Cocoon servlet in Cocoon's web.xml.

alternatively you can re-create the former inconsistent behaviour by only
setting the UTF-8 encoding on the HTML serializer.

(either of the above quick tests will probably tell us fast whether this is
indeed Cocoon-related or not)

HTH
-marc=
-- 
Marc Portier                            http://outerthought.org/
Outerthought - Open Source, Java & XML Competence Support Center
Read my weblog at              http://radio.weblogs.com/0116284/
mpo@outerthought.org                              mpo@apache.org


Re: encoding problem

Posted by Leszek Gawron <ou...@wlkp.org>.
On Wed, Nov 05, 2003 at 04:33:51PM +0100, Leszek Gawron wrote:
> Current CVS version of cocoon:
> 
> ServerPagesGenerator.generate(): java.lang.RuntimeException:
> org.xml.sax.SAXException: Attempt to output character of integral value 346
> that is not represented in specified output encoding of .
> 
> I am using UTF-8 everywhere, and switching to the current Cocoon CVS version either
> produces the above error or garbles all Polish characters in my HTML
> pages.
I am sorry. This has nothing to do with Cocoon. It's Tomcat 4.1.29's
fault. It sets the Content-Type header to ISO-8859-1 .. strange
	lg
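
A small sketch of why a mismatched Content-Type charset would garble the
pages, assuming the body bytes are UTF-8 while the client is told to read
them as ISO-8859-1. The class and method names are hypothetical; nothing
here touches Tomcat or Cocoon, it only reproduces the decoding mismatch
with standard Java charsets.

```java
import java.nio.charset.StandardCharsets;

public class CharsetMismatchSketch {

    // Hypothetical helper: encodes text as UTF-8, then decodes those bytes
    // as ISO-8859-1, which is what a client does when the header is wrong.
    public static String decodeAsLatin1(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "\u015A"; // U+015A, decimal 346, from the error above
        String garbled = decodeAsLatin1(original);
        System.out.println(original.length() + " char becomes " + garbled.length());
        for (char c : garbled.toCharArray()) {
            System.out.printf("U+%04X%n", (int) c);
        }
    }
}
```

The single character U+015A is two bytes in UTF-8 (0xC5 0x9A), so a client
reading ISO-8859-1 sees two unrelated characters, U+00C5 and U+009A, in its
place.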