Posted to user@lenya.apache.org by Peter Shipley <ps...@nomensa.com> on 2004/02/05 17:57:02 UTC

Static HTML Exporter - some resources not being exported

I have just tried looking at my exported files for the first time and 
noticed the html is saved off as well as the images coded into the HTML. 
All very nice. However, any images referenced in the CSS file are not 
automagically downloaded.

Anyone have any ideas on how to ensure that all resources referenced are 
downloaded (this would include javascript, flash etc.) ?

Is there any functionality I am missing or is it more of a case of 
extending the WGet.java, and supporting HTML utils ?

Regards

Peter

---------------------------------------------------------------------
To unsubscribe, e-mail: lenya-user-unsubscribe@cocoon.apache.org
For additional commands, e-mail: lenya-user-help@cocoon.apache.org


Re: Static HTML Exporter - some resources not being exported

Posted by Michael Wechner <mi...@wyona.com>.
Gregor J. Rothfuss wrote:

> Peter Shipley wrote:
>
>> Is there any functionality I am missing or is it more of a case of
>> extending the WGet.java, and supporting HTML utils ?
>
> one would need to extend the crawler



You can apply the method

org.apache.lenya.net.WGet.substitutePrefix()

to the CSS files that WGet retrieves. substitutePrefix() uses the class
org.apache.lenya.util.SED.
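
To make the idea of rewriting prefixes inside a stylesheet concrete, here is a minimal standalone sketch. It does not call WGet.substitutePrefix() or SED, and it is not their actual signature or behaviour; the class name CssPrefixRewriter, the method rewritePrefix() and the example prefix "/cms/live/" are all hypothetical names made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, NOT part of Lenya: rewrites a URL prefix inside CSS url(...) references.
final class CssPrefixRewriter {
    // url(...) with optional matching single or double quotes around the target.
    private static final Pattern CSS_URL =
        Pattern.compile("url\\(\\s*(['\"]?)([^'\")]+?)\\1\\s*\\)", Pattern.CASE_INSENSITIVE);

    static String rewritePrefix(String css, String oldPrefix, String newPrefix) {
        Matcher m = CSS_URL.matcher(css);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String quote = m.group(1);
            String target = m.group(2).trim();
            // Only targets that start with the old prefix are changed.
            if (target.startsWith(oldPrefix)) {
                target = newPrefix + target.substring(oldPrefix.length());
            }
            String replacement = "url(" + quote + target + quote + ")";
            // quoteReplacement keeps '$' and '\' in the target from being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String css = "a { background: url('/cms/live/images/bg.png'); }";
        System.out.println(rewritePrefix(css, "/cms/live/", ""));
    }
}
```

A real patch would presumably reuse the existing substitutePrefix() logic rather than a separate regular expression; the sketch only shows which parts of a CSS file would need the substitution.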

patches welcome ;-)

Michi





Re: Static HTML Exporter - some resources not being exported

Posted by "Gregor J. Rothfuss" <gr...@apache.org>.
Peter Shipley wrote:

> I have just tried looking at my exported files for the first time and 
> noticed the html is saved off as well as the images coded into the HTML. 
> All very nice. However, any images referenced in the CSS file are not 
> automagically downloaded.

Hmm, the crawler is not smart enough then. It only knows about
<img src="..."> and <link href="...">.

It's hairy code.

> Anyone have any ideas on how to ensure that all resources referenced are 
> downloaded (this would include javascript, flash etc.) ?
> 
> Is there any functionality I am missing or is it more of a case of 
> extending the WGet.java, and supporting HTML utils ?

one would need to extend the crawler
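
As a rough illustration of what such an extension would have to find, the sketch below pulls the url(...) targets out of a stylesheet so that they could be queued for download like the <img> and <link> targets. It is a standalone example, not the Lenya crawler code; the class name CssUrlExtractor and the method extractUrls() are hypothetical, and the regular expression only covers plain url(...) references (not @import with a bare string, for example).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, NOT part of Lenya: lists the targets of url(...) references in CSS text.
final class CssUrlExtractor {
    // url(...) with optional matching single or double quotes around the target.
    private static final Pattern CSS_URL =
        Pattern.compile("url\\(\\s*(['\"]?)([^'\")]+?)\\1\\s*\\)", Pattern.CASE_INSENSITIVE);

    static List<String> extractUrls(String css) {
        List<String> urls = new ArrayList<>();
        Matcher m = CSS_URL.matcher(css);
        while (m.find()) {
            // Group 2 is the target without quotes; trim stray whitespace.
            urls.add(m.group(2).trim());
        }
        return urls;
    }

    public static void main(String[] args) {
        String css = "body { background: url(\"images/bg.png\"); }\n"
                   + ".logo { background-image: url(images/logo.gif) }";
        for (String url : extractUrls(css)) {
            System.out.println(url);
        }
    }
}
```

The extracted targets are relative to the location of the CSS file, not to the HTML page that links it, so a crawler would have to resolve them against the stylesheet's URL before fetching them.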

-- 
Gregor J. Rothfuss
Wyona Inc.  -   Open Source Content Management   -   Apache Lenya
http://wyona.com                   http://cocoon.apache.org/lenya
gregor.rothfuss@wyona.com                       gregor@apache.org



Re: Static HTML Exporter - some resources not being exported

Posted by jp...@quoininc.com.
On Thu, 5 Feb 2004, Peter Shipley wrote:

> I have just tried looking at my exported files for the first time and
> noticed the html is saved off as well as the images coded into the HTML.
> All very nice. However, any images referenced in the CSS file are not
> automagically downloaded.
>
> Anyone have any ideas on how to ensure that all resources referenced are
> downloaded (this would include javascript, flash etc.) ?
>
> Is there any functionality I am missing or is it more of a case of
> extending the WGet.java, and supporting HTML utils ?

We've added a custom action that uses wget to spider the site.  It's
an indirect way to gather all resources.  I would prefer something
controlled by the build process, but this is what we're using now.
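
For anyone wanting to try the same approach, here is a minimal sketch of driving the command-line wget from Java. The post does not say how the custom action is written or which options it passes, so everything here is an assumption: the class name WgetSpider, the start URL, the output directory and the particular selection of GNU wget options.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, NOT the custom action described above.
final class WgetSpider {
    // Builds a GNU wget command line that mirrors a site including page requisites.
    static List<String> buildCommand(String startUrl, String outputDir) {
        List<String> cmd = new ArrayList<>();
        cmd.add("wget");
        cmd.add("--recursive");        // follow links from the start page
        cmd.add("--page-requisites");  // also fetch images, stylesheets and similar resources
        cmd.add("--convert-links");    // rewrite links so the local copy is browsable
        cmd.add("--no-parent");        // stay below the start URL's directory
        cmd.add("--directory-prefix=" + outputDir);
        cmd.add(startUrl);
        return cmd;
    }

    // Runs wget and returns its exit code; requires wget on the PATH.
    static int run(String startUrl, String outputDir) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(startUrl, outputDir))
            .inheritIO()
            .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // Assumed example values; pass your own URL and directory as arguments.
        String url = args.length > 0 ? args[0] : "http://localhost:8080/";
        String dir = args.length > 1 ? args[1] : "export";
        System.exit(run(url, dir));
    }
}
```

Whether wget itself follows url(...) references inside stylesheets depends on the wget version installed, so it is worth checking the exported directory for the CSS images after a test run.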

-- 
JP


