Posted to dev@cocoon.apache.org by Davanum Srinivas <di...@yahoo.com> on 2000/10/27 22:50:59 UTC
[C2] Generator which can do HTML->XHTML
HTMLGenerator fetches HTML from an external web site and feeds it into a pipeline as XHTML. It uses JTidy from
http://lempinen.net/sami/jtidy/.
This was inspired by the following thread:
http://marc.theaimsgroup.com/?t=97240792100003&w=2&r=1
Here's the generator entry.
<map:generator name="html" src="org.apache.cocoon.generation.HTMLGenerator"/>
Here's the pipeline entry.
<map:match pattern="javasoft">
  <map:generate type="html" src="http://www.javasoft.com"/>
  ....
  ....
</map:match>
If you are behind a firewall, make sure you set the following system properties on the java.exe command line:
java.exe -DproxySet=true -DproxyHost=caproxy.cai.com -DproxyPort=80 ....
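The -D flags above only set JVM system properties, so the same thing can be done from code before the first connection is opened. A minimal sketch, assuming the http.-prefixed property names (http.proxyHost, http.proxyPort) that the Java networking documentation describes rather than the unprefixed ones on the command line above; the class ProxySetup and its methods are hypothetical and not part of Cocoon:

```java
/** Hypothetical helper (not part of Cocoon): configures an HTTP proxy via system properties. */
final class ProxySetup {

    /** Sets http.proxyHost / http.proxyPort; assumes these prefixed names are the ones the JVM reads. */
    static void useHttpProxy(String host, int port) {
        if (host == null || host.isEmpty()) {
            throw new IllegalArgumentException("proxy host must not be empty");
        }
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("proxy port out of range: " + port);
        }
        System.setProperty("http.proxyHost", host);
        System.setProperty("http.proxyPort", Integer.toString(port));
    }

    /** Returns "host:port" as currently configured, or "direct" when no proxy host is set. */
    static String describe() {
        String host = System.getProperty("http.proxyHost");
        if (host == null || host.isEmpty()) {
            return "direct";
        }
        return host + ":" + System.getProperty("http.proxyPort", "80");
    }

    public static void main(String[] args) {
        // Same host and port as the command line above.
        useHttpProxy("caproxy.cai.com", 80);
        System.out.println(describe());
    }
}
```

Setting the properties on the command line is usually preferable, since it keeps site-specific proxy details out of the code; the programmatic form is mainly handy in tests.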
Thanks,
dims
=====
Davanum Srinivas, JNI-FAQ Manager
http://www.jGuru.com/faq/JNI
RE: [C2] Generator which can do HTML->XHTML
Posted by Per Kreipke <pe...@onclave.com>.
Davanum,
Nicely done.
Couple of questions (mostly for learning):
- What will happen with the Tidy.jar file needed to use this generator? Will
it be included in the C2 install?
- Is the stream returned from URL.openStream() buffered? If so, OK. If not,
why not wrap it in a buffered stream, since it's a network request:
    Document newdoc = tidy.parseDOM(new BufferedInputStream(url.openStream()), ostream);
Only asking because this is what I used under C1. Too bad JTidy can't emit
SAX events (understandably).
- Why use a ByteArrayStream as the intermediate stream?
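To make the buffering question concrete, here is a minimal sketch of the wrapping, using only java.io. It is not taken from HTMLGenerator.java: the class BufferedFetch and the method readAll are hypothetical, and a ByteArrayInputStream stands in for the network stream so the example runs offline. As far as I know the URL.openStream() Javadoc makes no promise about buffering, so wrapping is the cautious choice.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper, not taken from HTMLGenerator.java. */
final class BufferedFetch {

    /** Reads the whole stream through a BufferedInputStream and returns its bytes. */
    static byte[] readAll(InputStream raw) throws IOException {
        // The wrapper batches small reads into larger ones on the underlying stream.
        InputStream in = new BufferedInputStream(raw);
        // In-memory sink; it grows as needed, so the content length need not be known up front.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } finally {
            in.close(); // closing the wrapper also closes the wrapped stream
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for url.openStream(), so no network access is needed.
        InputStream stub = new ByteArrayInputStream(
                "<html><p>hi</html>".getBytes(StandardCharsets.UTF_8));
        System.out.println(readAll(stub).length + " bytes read");
    }
}
```

Against a real site the call would look like readAll(new java.net.URL("http://www.javasoft.com").openStream()), with the usual caveat that the whole page is then held in memory.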
Per.
Re: [C2] Generator which can do HTML->XHTML
Posted by Giacomo Pati <gi...@apache.org>.
Thanks, applied
Giacomo
Davanum Srinivas wrote:
> ------------------------------------------------------------------------
> Name: HTMLGenerator.java
> HTMLGenerator.java Type: unspecified type (application/octet-stream)
> Encoding: base64
> Description: HTMLGenerator.java