Posted to dev@forrest.apache.org by Florian Haas <f....@gmx.net> on 2004/01/22 17:40:24 UTC

SAXException: Attempt to output character that is not represented in specified output encoding

Hello,

I've recently run into a Forrest issue (CVS HEAD from this morning) which
puzzles me. Suddenly, when I access my index.html or any other document served
by "forrest run", I get the following exception:

17:28:04.842 EVENT  ERROR   (2004-01-22) 17:28.04:838   [access  ]
(/index.html) PoolThread-4/CocoonServlet: Internal Cocoon Problem
org.apache.cocoon.ProcessingException: Failed to execute pipeline.:
org.apache.cocoon.ProcessingException: Failed to execute pipeline.:
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException:
java.lang.RuntimeException: java.lang.RuntimeException: org.xml.sax.SAXException:
Attempt to output character of integral value 196 that is not represented in
specified output encoding of .
        at
org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.processXMLPipeline(AbstractProcessingPipeline.java:564)
        at
org.apache.cocoon.components.pipeline.impl.AbstractCachingProcessingPipeline.processXMLPipeline(AbstractCachingProcessingPipeline.java:196)
        at
org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.process(AbstractProcessingPipeline.java:501)
        at
org.apache.cocoon.components.treeprocessor.sitemap.SerializeNode.invoke(SerializeNode.java:147)
        at
org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:108)
        at
org.apache.cocoon.components.treeprocessor.ContainerNode.invoke(ContainerNode.java:66)
        at
org.apache.cocoon.components.treeprocessor.sitemap.CallNode.invoke(CallNode.java:128)

The documentation is in German and Unicode 196 is Ä (capital A umlaut), so
suspecting an encoding issue is, I suppose, not too far-fetched. :-) However,
all my documents are ISO 8859-1 and use either ISO entities or Unicode
escapes for any non-ASCII characters. Trying to track down the problem further is
currently getting me nowhere -- setting project.debuglevel to DEBUG in
forrest.properties generates no additional information in core.log, sitemap.log or
error.log except for a more detailed stack trace of the above exception.

What puzzles me most is the "specified output encoding of ." in the
exception message -- shouldn't the message contain the name of the encoding actually
being used, even if it's the default? 

The problem does not apply when I access one of my documents as PDF.
Accessing site.pdf (whole site as PDF), however, does cause the problem to occur. So
I do suspect that one of my documents contains something nasty -- any
suggestions on how to find out which one, given the fact that setting the
project.debuglevel to DEBUG won't do it? 
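One way to narrow this down without help from the logs is to scan the XML sources directly. The following is only a sketch and not anything Forrest provides: the function names and the directory passed to scan_tree are hypothetical, and it assumes the files are ISO 8859-1 as described above. It reports every line containing a literal non-ASCII character or a numeric character reference above 127; named entities such as &Auml; are not detected.

```python
import re
from pathlib import Path

# Matches numeric character references such as &#196; or &#xC4;
CHAR_REF = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));")


def suspicious_lines(text):
    """Return (line_number, line) pairs that contain a non-ASCII character,
    either literally or as a numeric character reference above 127."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        literal = any(ord(ch) > 127 for ch in line)
        escaped = any(
            int(hex_digits, 16) > 127 if hex_digits else int(dec_digits) > 127
            for hex_digits, dec_digits in CHAR_REF.findall(line)
        )
        if literal or escaped:
            hits.append((number, line.strip()))
    return hits


def scan_tree(root):
    """Print every suspicious line in the *.xml files below root (hypothetical path)."""
    for path in sorted(Path(root).rglob("*.xml")):
        # ISO 8859-1 maps every byte to a character, so reading never fails.
        text = path.read_text(encoding="iso-8859-1")
        for number, line in suspicious_lines(text):
            print(f"{path}:{number}: {line}")
```

Calling scan_tree on the project's XML source directory lists candidate lines, which can then be removed one at a time until the exception disappears.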

Thanks a lot in advance,
Florian

-- 
+++ GMX - die erste Adresse für Mail, Message, More +++
Bis 31.1.: TopMail + Digicam für nur 29 EUR http://www.gmx.net/topmail


Re: SAXException: Attempt to output character that is not represented in specified output encoding

Posted by Florian Haas <f....@gmx.net>.
Hello Johan,

> The same problem occurs if you have accented characters within the 
> XML documents. Also check for a dash, as I picked up problems even with 
> that.

OK, (temporarily) fixed. I misunderstood Eric's post and thought he was
referring only to unescaped non-ASCII characters in site.xml. But apparently
the problem is also caused by Unicode escapes, such as &#196;. Bummer.
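That escapes behave like literal characters is what one would expect from any XML parser: a numeric character reference is resolved at parse time, so later pipeline stages only ever see the character itself. A small sketch using Python's standard library rather than Cocoon or Xalan (the element name, attribute value and helper function are made up for illustration):

```python
import xml.etree.ElementTree as ET

# The same attribute, once with a numeric character reference
# and once with the literal character U+00C4.
escaped = ET.fromstring('<faq description="&#196;nderungen"/>')
literal = ET.fromstring('<faq description="\u00c4nderungen"/>')

# After parsing, both hold the identical string; the escape is gone.
same_value = escaped.get("description") == literal.get("description")


def representable(value, encoding):
    """Return True if every character of value exists in the given encoding."""
    try:
        value.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False
```

Here same_value is true, and the parsed value is representable in ISO 8859-1 but not in US-ASCII. A serializer that believes its output encoding cannot hold character 196 will therefore complain regardless of how the character was written in the source.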

I rewrote my site.xml to not include any non-ASCII characters, the exception
is gone, and I can continue working. Still, the situation IMHO is very much
sub-optimal -- I suppose I'm not the only person out there who would prefer
to use the full character set available in his own native language. :-)

I'm still wondering, does this constitute a Forrest issue, or is this a
Cocoon bug?

Thanks again to you and Eric for your help.

Florian





Re: SAXException: Attempt to output character that is not represented in specified output encoding

Posted by Florian Haas <f....@gmx.net>.
Hi again.

> Try to see if it's a problem from the side menu generation or from the
> content generation: try
> 
> 1- body-your_buggy_page.html (This one should display. I've got no problem
> with xml document content)

Yes. This works.

> 2- book-your_buggy_page.html (fails for me if tab and id attributes are
> accented)

This one fails for me.

> 2b menulinks-your_buggy_page (idem)

This one works for me, interestingly.

In addition, I've found out in the meantime that I can use non-ASCII
characters anywhere but in a "description" attribute. That is to say, if I have an
element with label="Übersicht", it works, but if I add description="Übersicht"
to the same element, I get the original error again. 
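To list the affected attributes in a site.xml, something like the following could be used. It is a sketch only: the function name and the sample markup are hypothetical, and it relies on Python's standard XML parser rather than on anything in Forrest.

```python
import xml.etree.ElementTree as ET


def non_ascii_attributes(xml_text, attribute="description"):
    """Return (tag, value) for every element whose given attribute
    contains at least one non-ASCII character."""
    root = ET.fromstring(xml_text)
    hits = []
    for element in root.iter():
        value = element.get(attribute)
        if value is not None and not value.isascii():
            hits.append((element.tag, value))
    return hits


# Made-up sample in the style of a site.xml
sample = (
    '<site>'
    '<about label="\u00dcbersicht" href="about/"/>'
    '<faq label="FAQ" href="faq.html" description="&#220;bersicht"/>'
    '</site>'
)
for tag, value in non_ascii_attributes(sample):
    print(tag, value)
```

Because the parser resolves character references before the check runs, literal characters and escapes are both reported. Passing attribute="label" gives the corresponding list for labels.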

Puzzling.

Florian



Re: SAXException: Attempt to output character that is not represented in specified output encoding

Posted by BURGHARD Eric <eb...@free.fr>.
Johan Kok wrote:

> The same problem occurs if you have accented characters within the
> XML documents. Also check for a dash, as I picked up problems even with
> that.
> 
> Johan Kok
> 
> Florian Haas wrote:
> 

Try to see if it's a problem from the side menu generation or from the
content generation: try

1- body-your_buggy_page.html (This one should display. I've got no problem
with xml document content)
2- book-your_buggy_page.html (fails for me if tab and id attributes are
accented)
2b menulinks-your_buggy_page (idem)



Re: SAXException: Attempt to output character that is not represented in specified output encoding

Posted by Johan Kok <jk...@messianic.dyndns.org>.
The same problem occurs if you have accented characters within the 
XML documents. Also check for a dash, as I picked up problems even with 
that.

Johan Kok

Florian Haas wrote:

> Hello Eric.
>
> Thanks for your reply.
>
>> You can fix it by removing all accented characters from the id attributes
>> of your tabs.xml and thus from the tab attributes of your site.xml. It
>> seems that the XSL transformer can't deal with accented characters in tab
>> attribute content in the Forrest CVS version.
>
> Alas, no luck. I don't have any non-US-ASCII characters (not even as Unicode
> escapes) in id attributes in tabs.xml, nor in tab attrs in site.xml.
>
> I'll try to investigate further, using the other clues you provided. Thanks
> again.
>
> Any other insights on this matter?
> Florian

Re: SAXException: Attempt to output character that is not represented in specified output encoding

Posted by Florian Haas <f....@gmx.net>.
Hello Eric.

Thanks for your reply.

> You can fix it by removing all accented characters from the id attributes
> of your tabs.xml and thus from the tab attributes of your site.xml. It
> seems that the XSL transformer can't deal with accented characters in tab
> attribute content in the Forrest CVS version.

Alas, no luck. I don't have any non-US-ASCII characters (not even as Unicode
escapes) in id attributes in tabs.xml, nor in tab attrs in site.xml.

I'll try to investigate further, using the other clues you provided. Thanks
again.

Any other insights on this matter?
Florian 





Re: SAXException: Attempt to output character that is not represented in specified output encoding

Posted by BURGHARD Eric <eb...@free.fr>.
Hi everybody (quite a long time since my last post to this list. Happy new
year ;-)

I've got the same error. Here is the state of my knowledge about this issue
(which is not resolved entirely yet ;-)

I suspect the problem is coming from a Cocoon component linked with a wrong
version of the endorsed jars, because I already encountered this behavior when
I put new versions of these jars into lib/endorsed with an old version of
Forrest. Copying back the old files fixes the problem. Beyond that, I don't
really know where it comes from.

I just know that the problem arises in the **menulinks-* pipeline during
the generation of the side menu. More specifically, at the <map:transform
src="{forrest:stylesheets}/site2site-normalizetabs.xsl" /> stage in
menus.xmap. This is why the bug didn't affect PDF generation, which has no
menu at all.

You can fix it by removing all accented characters from the id attributes
of your tabs.xml and thus from the tab attributes of your site.xml. It
seems that the XSL transformer can't deal with accented characters in tab
attribute content in the Forrest CVS version.

Hope it helps.

Eric BURGHARD


Re: SAXException: Attempt to output character that is not represented in specified output encoding

Posted by Eirik Bjørsnøs <bj...@stud.ntnu.no>.
> java.lang.RuntimeException: java.lang.RuntimeException:
> org.xml.sax.SAXException: Attempt to output character of integral value
> 196 that is not represented in

Hello Florian,

I'll eat my socks if this problem is not related to a bug in version
2.5.2 of Xalan used in Cocoon 2.1.3.

The Cocoon and Xalan jars used by Forrest were upgraded last week, right?

The bug description in Cocoon:
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=24457

this depends on the following bug in Xalan:
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=24278

You could try downgrading to Xalan 2.5.1 or compiling the latest version
from CVS to see if that fixes the problem.

Another workaround mentioned is to use xsltc instead of xslt, but I'm not
sure if that's supported in Forrest.

It would be nice if someone could help us verify that the bugs are
related, since the stacktrace didn't really match well.

Eirik.
-- 
Eirik Bjørsnøs
bjorsnos@stud.ntnu.no