Posted to users@cocoon.apache.org by Reinhard Haller <re...@interactive-net.de> on 2005/06/01 07:34:57 UTC
Status JTidy
Hi,
I ran into a strange encoding error in the XML serializer
when using the following pipeline:
<map:match pattern="cwx.xhtml">
  <map:generate type="html-utf-8"
    src="http://www.computerwoche.de/index.cfm?pageid=254&amp;type=detail&amp;artid=74090&amp;linktype=rss&amp;category=318"
  />
  <map:serialize type="xml"/>
</map:match>
It turned out that JTidy produced invalid XML output, so the
serializer refused to convert the encoding from UTF-8 to ISO-8859-1.
All of this happened without any error message, although JTidy did
issue a lot of warnings.
Like Tidy, JTidy cannot mask out script blocks that contain offending
characters and are not enclosed in an HTML comment, which browsers
handle without complaint:
<SCRIPT>
var adrandno=Math.floor(Math.random()*100);
document.write('<scr'+'ipt language="JavaScript1.1" src="http://ad.de.doubleclick.net/adj/cowo.de/knowledge/it_security/bigsky;abr=!webtv;sz=160x800;ord='+adrandno+'?"><\/scr'+'ipt>');
if ((!document.images && navigator.userAgent.indexOf('Mozilla/2.') >= 0) || navigator.userAgent.indexOf("WebTV")>= 0) {
  document.write('<a HREF="http://ad.de.doubleclick.net/jump/cowo.de/knowledge/it_security/bigsky;sz=160x800;ord='+adrandno+'?">');
  document.write('<img SRC="http://ad.de.doubleclick.net/ad/cowo.de/knowledge/it_security/bigsky;sz=160x800;ord='+adrandno+'?" border="0" height="800" width="160"></a>')
}
</SCRIPT>
The last JTidy release dates from 2001.
Is there a newer version around?
Is the project still alive?
Greetings
Reinhard Haller
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@cocoon.apache.org
For additional commands, e-mail: users-help@cocoon.apache.org
Re: Status JTidy
Posted by Jon Merriman <jm...@gonzaga.edu>.
>
> The last release version for JTidy is from 2001.
>
> Is there a newer version around?
> Is the project still alive?
Yes. Check out the nightly builds (last updated in January 2005). A
number of improvements have been made since the 2001 release.
Re: Status JTidy
Posted by Upayavira <uv...@odoko.co.uk>.
You can use NekoHTML instead of JTidy. Cocoon's HTML block ships a
generator for it. See org.apache.cocoon.generation.NekoHTMLGenerator
in the HTML block.
Regards, Upayavira
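As a rough illustration of the suggestion above, a sitemap could declare the NekoHTML generator and use it in the original pipeline in place of the JTidy-based one. This is only a sketch under a few assumptions: the HTML block is included in the Cocoon build, the generator name "nekohtml" is chosen here purely for illustration, and the fully qualified class name should be verified against your Cocoon distribution.

```xml
<map:components>
  <map:generators default="file">
    <!-- "nekohtml" is a hypothetical name picked for this sketch;
         the class name is an assumption to check against your build -->
    <map:generator name="nekohtml"
                   src="org.apache.cocoon.generation.NekoHTMLGenerator"/>
  </map:generators>
</map:components>

<map:pipelines>
  <map:pipeline>
    <map:match pattern="cwx.xhtml">
      <!-- same source URL as in the original pipeline, with "&"
           escaped as "&amp;" so the sitemap stays well-formed XML -->
      <map:generate type="nekohtml"
                    src="http://www.computerwoche.de/index.cfm?pageid=254&amp;type=detail&amp;artid=74090&amp;linktype=rss&amp;category=318"/>
      <map:serialize type="xml"/>
    </map:match>
  </map:pipeline>
</map:pipelines>
```

Whether NekoHTML copes better with the unprotected script block on this particular page is something to test rather than assume; the sketch only shows where the generator would be swapped in.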