Posted to dev@cocoon.apache.org by Davanum Srinivas <di...@yahoo.com> on 2000/10/27 22:50:59 UTC

[C2] Generator which can do HTML->XHTML

HTMLGenerator can be used to pull content from external web sites into a C2 pipeline:
it fetches the HTML page and converts it to XHTML. It uses JTidy from
http://lempinen.net/sami/jtidy/.

This was inspired by the following thread:
   http://marc.theaimsgroup.com/?t=97240792100003&w=2&r=1

Here's the generator entry:
   <map:generator name="html" src="org.apache.cocoon.generation.HTMLGenerator"/>

Here's the pipeline entry:
   <map:match pattern="javasoft">
    <map:generate type="html" src="http://www.javasoft.com"/>
    ....
    ....
   </map:match>

If you are behind a firewall, set the following system properties on the java.exe command line:

   java.exe -DproxySet=true -DproxyHost=caproxy.cai.com -DproxyPort=80 ....
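
Where editing the command line is awkward, the same properties can in principle be set from code before the first connection is opened. What follows is only a minimal sketch under assumptions: the class name ProxySetup and the host proxy.example.com are hypothetical, and it sets both the legacy proxySet/proxyHost/proxyPort names used above and the http.proxyHost/http.proxyPort names that later JVMs document, since which ones are honored depends on the JVM in use.

```java
public class ProxySetup {
    /** Sets proxy-related system properties; call before opening any URL. */
    public static void configure(String host, int port) {
        String portText = Integer.toString(port);
        // Legacy names, matching the -D flags on the command line above.
        System.setProperty("proxySet", "true");
        System.setProperty("proxyHost", host);
        System.setProperty("proxyPort", portText);
        // Names documented for the HTTP protocol handler on later JVMs.
        System.setProperty("http.proxyHost", host);
        System.setProperty("http.proxyPort", portText);
    }

    public static void main(String[] args) {
        // Hypothetical proxy host; substitute your own.
        configure("proxy.example.com", 80);
        System.out.println(System.getProperty("proxyHost")
                + ":" + System.getProperty("proxyPort"));
    }
}
```

Setting the properties in code only affects connections opened afterwards, so the call would have to run early, for example at servlet start-up.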

Thanks,
dims


=====
Davanum Srinivas, JNI-FAQ Manager
http://www.jGuru.com/faq/JNI


RE: [C2] Generator which can do HTML->XHTML

Posted by Per Kreipke <pe...@onclave.com>.
Davanum,

Nicely done.

A couple of questions (mostly for my own learning):

- What will happen with the Tidy.jar file needed to use this generator? Will
it be included in the C2 install?

- Is the stream returned from URL.openStream() buffered? If so, OK. If not,
why not wrap it in a buffered stream, since it's a network request:

   Document newdoc = tidy.parseDOM(
       new BufferedInputStream(url.openStream()), ostream);

Only asking because this is what I used under C1. Too bad JTidy can't emit
SAX events (understandably).

- Why use a ByteArrayOutputStream as the intermediate stream?
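
To make the two stream questions concrete, here is a minimal sketch using only java.io. It wraps an input stream in a BufferedInputStream and drains it into a ByteArrayOutputStream, which is one common reason for an in-memory intermediate: the bytes can be inspected or parsed again afterwards. The class name StreamCopy and the helper readAll are hypothetical, a ByteArrayInputStream stands in for url.openStream() so the example runs without a network, and none of this is taken from the attached HTMLGenerator.java.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class StreamCopy {
    /** Reads the whole stream through a buffer and returns its bytes. */
    public static byte[] readAll(InputStream raw) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // try-with-resources closes the buffered wrapper and the raw stream.
        try (InputStream in = new BufferedInputStream(raw)) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                sink.write(chunk, 0, n);
            }
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the bytes a remote page would return.
        byte[] html = "<html><body><p>hi</body></html>".getBytes("UTF-8");
        byte[] copy = readAll(new ByteArrayInputStream(html));
        System.out.println(copy.length + " bytes read");
    }
}
```

The trade-off is memory: the intermediate array holds the entire page, which is fine for typical HTML pages but worth reconsidering for very large responses.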

Per.



Re: [C2] Generator which can do HTML->XHTML

Posted by Giacomo Pati <gi...@apache.org>.
Thanks, applied

Giacomo

Davanum Srinivas wrote:
> 
> HTMLGenerator can be used to collect content from external web sites. It uses JTidy from
> http://lempinen.net/sami/jtidy/.
> 
> This was inspired by the following thread
>    http://marc.theaimsgroup.com/?t=97240792100003&w=2&r=1
> 
> Here's the generator entry.
>    <map:generator  name="html" src="org.apache.cocoon.generation.HTMLGenerator"/>
> 
> Here's the pipeline entry.
>    <map:match pattern="javasoft">
>     <map:generate type="html" src="http://www.javasoft.com"/>
>     ....
>     ....
>    </map:match>
> 
> If you are within a firewall, make sure you set the following params in the java.exe command line.
> 
>    java.exe -DproxySet=true -DproxyHost=caproxy.cai.com -DproxyPort=80 ....
> 
> Thanks,
> dims
> 
> =====
> Davanum Srinivas, JNI-FAQ Manager
> http://www.jGuru.com/faq/JNI
> 
>   ------------------------------------------------------------------------
>                             Name: HTMLGenerator.java
>    HTMLGenerator.java       Type: unspecified type (application/octet-stream)
>                         Encoding: base64
>                      Description: HTMLGenerator.java