Posted to users@cocoon.apache.org by Reinhard Haller <re...@interactive-net.de> on 2005/06/01 07:34:57 UTC

Status JTidy

Hi,

I ran into a strange encoding error from the XML serializer
with the following pipeline:

<map:match pattern="cwx.xhtml">
   <map:generate type="html-utf-8"
                 src="http://www.computerwoche.de/index.cfm?pageid=254&amp;type=detail&amp;artid=74090&amp;linktype=rss&amp;category=318"/>
   <map:serialize type="xml"/>
</map:match>

It turned out that JTidy produced invalid XML output, and the
serializer refused to convert the encoding from UTF-8 to ISO-8859-1.
All of this happened without any error message (although JTidy issued
a lot of warnings).
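
For context, the target encoding of the "xml" serializer is normally set
where the serializer is declared in the sitemap. The fragment below is only
a sketch assuming a Cocoon 2.1 style declaration; the post does not show the
actual configuration, so the mime-type and the default serializer name are
assumptions:

```xml
<!-- Sitemap fragment (assumed, not taken from the original post) -->
<map:serializers default="html">
  <map:serializer name="xml" mime-type="text/xml"
                  src="org.apache.cocoon.serialization.XMLSerializer">
    <!-- Output encoding the serializer converts to -->
    <encoding>ISO-8859-1</encoding>
  </map:serializer>
</map:serializers>
```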

JTidy, like Tidy, cannot mask out script sections that contain offending
characters and are not enclosed in an HTML comment, as browsers do:

<SCRIPT>
var adrandno=Math.floor(Math.random()*100);
document.write('<scr'+'ipt language="JavaScript1.1" src="http://ad.de.doubleclick.net/adj/cowo.de/knowledge/it_security/bigsky;abr=!webtv;sz=160x800;ord='+adrandno+'?"><\/scr'+'ipt>');
	if ((!document.images && navigator.userAgent.indexOf('Mozilla/2.') >= 0)  || navigator.userAgent.indexOf("WebTV")>= 0) {
		document.write('<a HREF="http://ad.de.doubleclick.net/jump/cowo.de/knowledge/it_security/bigsky;sz=160x800;ord='+adrandno+'?" >');
		document.write('<img SRC="http://ad.de.doubleclick.net/ad/cowo.de/knowledge/it_security/bigsky;sz=160x800;ord='+adrandno+'?" border="0" height="800" width="160"></a>')
	}
</SCRIPT>

The last release of JTidy is from 2001.

Is there a newer version around? 
Is the project still alive?

Greetings
Reinhard Haller


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@cocoon.apache.org
For additional commands, e-mail: users-help@cocoon.apache.org


Re: Status JTidy

Posted by Jon Merriman <jm...@gonzaga.edu>.
>
> The last release version for JTidy is from 2001.
>
> Is there a newer version around?
> Is the project still alive?

Yes. Check out the nightly builds (last updated in January of this year).
A number of improvements have been made since the 2001 release.



Re: Status JTidy

Posted by Upayavira <uv...@odoko.co.uk>.
You can use NekoHTML instead of JTidy. It is included in the HTML
block; see org.apache.cocoon.generation.NekoHTMLGenerator.
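
As a rough illustration of that suggestion, the pipeline from the original
post could be switched to the NekoHTML generator along these lines. This is
a sketch only: it assumes Cocoon 2.1 with the HTML block installed, assumes
the class lives in the org.apache.cocoon.generation package, and uses
"nekohtml" as an arbitrary generator name chosen here.

```xml
<!-- Sitemap fragment: declare the generator (the name "nekohtml" is an assumption) -->
<map:generators>
  <map:generator name="nekohtml"
                 src="org.apache.cocoon.generation.NekoHTMLGenerator"/>
</map:generators>

<!-- Same match as in the original post, with only the generator type changed -->
<map:match pattern="cwx.xhtml">
   <map:generate type="nekohtml"
                 src="http://www.computerwoche.de/index.cfm?pageid=254&amp;type=detail&amp;artid=74090&amp;linktype=rss&amp;category=318"/>
   <map:serialize type="xml"/>
</map:match>
```

Whether NekoHTML copes better with the unescaped script content on that
page would still have to be checked against the real input.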

Regards, Upayavira


