Posted to dev@lenya.apache.org by Hubertus Groepper <hu...@groepper.com> on 2004/12/10 11:30:49 UTC
WGet static export
Hi there
Dumb question:
Is http://issues.eu.apache.org/bugzilla/show_bug.cgi?id=26987 (absolute
path in pages of exported static site)
and http://issues.eu.apache.org/bugzilla/show_bug.cgi?id=26986
(Navigation inconsistent for export of static site)
and http://issues.eu.apache.org/bugzilla/show_bug.cgi?id=26708
(StaticHTMLExporter/WGet is not exporting css files in link tags)
still valid in light of comment #3:
the static exporter will be replaced by the cocoon cli so it makes
little sense to fix this now
of that last bug, from Gregor on 2004-03-24?
If yes (or unrelated), maybe this is an issue with wrong parameters
being passed on to WGet in j.o.a.lenya.net.WGet?
I do export my site (off the live area) with
wget -r -k -np -nH -N -p --cut-dirs=3
http://localhost:8080/lenya/default/live/index.html
and don't experience any of the aforementioned problems.
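For readers unfamiliar with the options in the command above, the same invocation can be written out with each flag annotated. This is only a sketch: the export_static function name and the WGET override variable are assumptions introduced here for illustration and are not part of Lenya; the flags and the URL are the ones quoted above.

```shell
#!/bin/sh
# Sketch only: export_static and the WGET override are hypothetical names,
# not part of Lenya. The wget options are the ones from the command above.
export_static() {
    start_url=$1
    # -r            follow links recursively from the start page
    # -k            convert links in saved pages so they work locally
    # -np           never ascend to the parent directory of the start URL
    # -nH           do not create a host-named top-level directory
    # -N            timestamping: only re-fetch files newer than the local copy
    # -p            also fetch page requisites such as images and stylesheets
    # --cut-dirs=3  drop the three leading path components (lenya/default/live)
    "${WGET:-wget}" -r -k -np -nH -N -p --cut-dirs=3 "$start_url"
}
```

Called as export_static http://localhost:8080/lenya/default/live/index.html from an empty directory, this should place index.html and the pages below it directly in that directory, since -nH and --cut-dirs=3 together strip the host and the lenya/default/live prefix.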
And in my usual attempt to melange two threads into one on my road to
enlightenment: what's the current recommended approach to export to
static?
Thanks.
hubertus
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lenya.apache.org
For additional commands, e-mail: dev-help@lenya.apache.org
Apache Lenya Project http://lenya.apache.org
Re: WGet static export
Posted by Comiotto Thomas <th...@unicom.unizh.ch>.
>
> did you try build export -Dpublication=default
>
> i just checked in 2 fixes to that target, but it needs more work. at
> least it crawled the exception ;)
>
Will give it another try then!
Thomas
Re: WGet static export
Posted by "Gregor J. Rothfuss" <gr...@apache.org>.
Comiotto Thomas wrote:
> Now since the cli export task seems to be broken - can I do something
> to help fixing it?
did you try build export -Dpublication=default
i just checked in 2 fixes to that target, but it needs more work. at
least it crawled the exception ;)
--
Gregor J. Rothfuss
COO, Wyona Content Management Solutions http://wyona.com
Apache Lenya http://lenya.apache.org
gregor.rothfuss@wyona.com gregor@apache.org
Re: WGet static export
Posted by Jean Pierre LeJacq <jp...@quoininc.com>.
On Mon, 13 Dec 2004, Comiotto Thomas wrote:
> Hello Jean-Pierre
>
>
> >> talking about exporting and not about re-importing ourselves, do we?
> >
> > I assume you mean using the lenya sitetree.xml file to crawl. This
> > isn't sufficient since it doesn't list all resources such as CSS
> > files, lenya assets, etc.
>
> Now since the cli export task seems to be broken - can I do
> something to help fixing it?
Well I'm moving over to 2.1.6 right now. Once I have that in place
I'd like to look at how forrest uses it. My thinking is to reuse
this if possible.
--
JP
Re: WGet static export
Posted by Comiotto Thomas <th...@unicom.unizh.ch>.
Hello Jean-Pierre
>> talking about exporting and not about re-importing ourselves, do we?
>
> I assume you mean using the lenya sitetree.xml file to crawl. This
> isn't sufficient since it doesn't list all resources such as CSS
> files, lenya assets, etc.
>
Now since the cli export task seems to be broken - can I do something
to help fixing it?
Thomas
Re: WGet static export
Posted by Jean Pierre LeJacq <jp...@quoininc.com>.
On Fri, 10 Dec 2004, Thomas Comiotto wrote:
> > I'm in fact working on a static site exporter based on the cocoon
> > CLI mode. I hope to have this available soon.
>
> Maybe we can join forces then - I'm currently using a sitemap-only
> based hack that uses the sitetree to fetch pages because the
> WGet/crawling-approach isn't flexible, stable and fast enough for what
> I need (exporting user-configurable subsets of a publication,
> forms-based navigation). Now I want to move that to cocoon CLI.
>
> Still, in contrast to what was agreed here crawling a publication in
> the sense of trying to find out something I already know (site
> structure & contents) doesn't really make sense to me. After all we're
> talking about exporting and not about re-importing ourselves, do we?
I assume you mean using the lenya sitetree.xml file to crawl. This
isn't sufficient since it doesn't list all resources such as CSS
files, lenya assets, etc.
--
JP
Re: WGet static export
Posted by Thomas Comiotto <co...@rcfmedia.ch>.
Hello Jean-Pierre
>> maybe that time would be better spent to move it over to cocoon cli.
>
> I'm in fact working on a static site exporter based on the cocoon
> CLI mode. I hope to have this available soon.
Maybe we can join forces then - I'm currently using a sitemap-only
based hack that uses the sitetree to fetch pages because the
WGet/crawling-approach isn't flexible, stable and fast enough for what
I need (exporting user-configurable subsets of a publication,
forms-based navigation). Now I want to move that to cocoon CLI.
Still, in contrast to what was agreed here crawling a publication in
the sense of trying to find out something I already know (site
structure & contents) doesn't really make sense to me. After all we're
talking about exporting and not about re-importing ourselves, do we?
Regards
Thomas
Re: WGet static export
Posted by Comiotto Thomas <th...@unicom.unizh.ch>.
>
>>>
>>> on the contrary, this allows you to reuse the live pipelines very
>>> easily, without having to second-guess lenya.
>>>
>> True - but why can't we just reuse pipelines internally?
>
> you would still have to build the part that fetches the pages and
> stores them somewhere, with directory structure intact. which is what
> WGet does, and CLI too.
>
Sure - but you still might need a facility to export a pub to some
other (possibly web-unaware) format, like in my case ELML for
instance (markup targeted at eLearning scenarios - they put all of the
contents including navigational info into one file). Took me a couple
of hours to extend that to support xhtml and it works like a charm.
But yeah - I am of course also in favor of standardized hybridity! And
in favor of the cli.
Bests
Thomas
Re: WGet static export
Posted by "Gregor J. Rothfuss" <gr...@apache.org>.
Comiotto Thomas wrote:
>>
>> on the contrary, this allows you to reuse the live pipelines very
>> easily, without having to second-guess lenya.
>>
>
> True - but why can't we just reuse pipelines internally?
you would still have to build the part that fetches the pages and stores
them somewhere, with directory structure intact. which is what WGet
does, and CLI too.
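To make the point concrete, the "stores them somewhere, with directory structure intact" half of the job can be sketched in a few lines of shell. This is an illustration only, not Lenya's WGet class or the Cocoon CLI: url_to_path, fetch_page and the WGET override are hypothetical names, and the fetch step simply shells out to wget with its -q and -O options.

```shell
#!/bin/sh
# Sketch only: url_to_path and fetch_page are hypothetical helpers, not Lenya
# or Cocoon code. They show the bookkeeping any exporter has to do.

# Map a page URL under base_url to a file path under out_dir,
# preserving the directory structure of the URL.
url_to_path() {
    base_url=$1
    url=$2
    out_dir=$3
    rel=${url#"$base_url"}      # strip the common prefix
    rel=${rel#/}                # drop a leading slash, if any
    case $rel in
        ''|*/) rel=${rel}index.html ;;   # directory URLs become index.html
    esac
    printf '%s/%s\n' "$out_dir" "$rel"
}

# Fetch one page and store it at the mapped location.
fetch_page() {
    target=$(url_to_path "$1" "$2" "$3")
    mkdir -p "$(dirname "$target")"
    "${WGET:-wget}" -q -O "$target" "$2"
}
```

For example, url_to_path http://localhost:8080/lenya/default/live http://localhost:8080/lenya/default/live/features/index.html export prints export/features/index.html. Deciding which URLs to feed into fetch_page (crawling links versus reading the sitetree) is the separate question debated in this thread.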
--
Gregor J. Rothfuss
COO, Wyona Content Management Solutions http://wyona.com
Apache Lenya http://lenya.apache.org
gregor.rothfuss@wyona.com gregor@apache.org
Re: WGet static export
Posted by Comiotto Thomas <th...@unicom.unizh.ch>.
>
> on the contrary, this allows you to reuse the live pipelines very
> easily, without having to second-guess lenya.
>
True - but why can't we just reuse pipelines internally?
Bests
Thomas
Re: WGet static export
Posted by "Gregor J. Rothfuss" <gr...@apache.org>.
Comiotto Thomas wrote:
> Maybe we can join forces - I'm currently using a sitemap-only based hack
> that uses the sitetree to fetch pages because the WGet/crawling-approach
> isn't flexible, stable and fast enough for what I need (exporting
> user-configurable subsets of a publication, forms-based navigation). Now
> I want to move that to cocoon CLI.
>
> Still, in contrast to what was agreed here crawling a publication in the
> sense of trying to find out something I already know (site structure &
> contents) doesn't really make sense to me. After all we're talking about
> exporting and not about re-importing ourselves, do we?
on the contrary, this allows you to reuse the live pipelines very
easily, without having to second-guess lenya.
--
Gregor J. Rothfuss
COO, Wyona Content Management Solutions http://wyona.com
Apache Lenya http://lenya.apache.org
gregor.rothfuss@wyona.com gregor@apache.org
Re: WGet static export
Posted by Comiotto Thomas <th...@unicom.unizh.ch>.
Hello Jean-Pierre
>> maybe that time would be better spent to move it over to cocoon cli.
>
> I'm in fact working on a static site exporter based on the cocoon
> CLI mode. I hope to have this available soon.
Maybe we can join forces - I'm currently using a sitemap-only based
hack that uses the sitetree to fetch pages because the
WGet/crawling-approach isn't flexible, stable and fast enough for what
I need (exporting user-configurable subsets of a publication,
forms-based navigation). Now I want to move that to cocoon CLI.
Still, in contrast to what was agreed here crawling a publication in
the sense of trying to find out something I already know (site
structure & contents) doesn't really make sense to me. After all we're
talking about exporting and not about re-importing ourselves, do we?
Regards
Thomas
Re: WGet static export
Posted by Jean Pierre LeJacq <jp...@quoininc.com>.
On Fri, 10 Dec 2004, Gregor J. Rothfuss wrote:
> Hubertus Groepper wrote:
> >
> > Is http://issues.eu.apache.org/bugzilla/show_bug.cgi?id=26987 (absolute
> > path in pages of exported static site)
> > and http://issues.eu.apache.org/bugzilla/show_bug.cgi?id=26986
> > (Navigation inconsistent for export of static site)
> > and http://issues.eu.apache.org/bugzilla/show_bug.cgi?id=26708
> > (StaticHTMLExporter/WGet is not exporting css files in link tags)
> > still valid in light of comment #3:
> > the static exporter will be replaced by the cocoon cli so it makes
> > little sense to fix this now
> > of that last bug, from Gregor on 2004-03-24?
> >
> > If yes (or unrelated), maybe this is an issue with wrong parameters
> > being passed on to WGet in j.o.a.lenya.net.WGet?
>
> not a dumb question at all. it seems unnecessary to have our own crawler
> (WGet) when Cocoon has a perfectly fine CLI mode for this purpose. thus
> my comment. if you are so inclined, you are welcome to fix WGet, but
> maybe that time would be better spent to move it over to cocoon cli.
I'm in fact working on a static site exporter based on the cocoon
CLI mode. I hope to have this available soon.
--
JP
Re: WGet static export
Posted by "Gregor J. Rothfuss" <gr...@apache.org>.
Hubertus Groepper wrote:
> Hi there
>
> Dumb question:
> Is http://issues.eu.apache.org/bugzilla/show_bug.cgi?id=26987 (absolute
> path in pages of exported static site)
> and http://issues.eu.apache.org/bugzilla/show_bug.cgi?id=26986
> (Navigation inconsistent for export of static site)
> and http://issues.eu.apache.org/bugzilla/show_bug.cgi?id=26708
> (StaticHTMLExporter/WGet is not exporting css files in link tags)
> still valid in light of comment #3:
> the static exporter will be replaced by the cocoon cli so it makes
> little sense to fix this now
> of that last bug, from Gregor on 2004-03-24?
>
> If yes (or unrelated), maybe this is an issue with wrong parameters
> being passed on to WGet in j.o.a.lenya.net.WGet?
not a dumb question at all. it seems unnecessary to have our own crawler
(WGet) when Cocoon has a perfectly fine CLI mode for this purpose. thus
my comment. if you are so inclined, you are welcome to fix WGet, but
maybe that time would be better spent to move it over to cocoon cli.
> I do export my site (off the live area) with
>
> wget -r -k -np -nH -N -p --cut-dirs=3
> http://localhost:8080/lenya/default/live/index.html
>
> and don't experience any of the aforementioned problems.
that is always an option, although not a portable one, and does not
happen automatically when you publish (which is presumably what you'd want)
> And in my usual attempt to melange two threads into one on my road to
> enlightenment: what's the current recommended approach to export to static?
if wget -r works fine for you, by all means use it. if you are
interested in improving lenya's own export facilities, then you should
take a look at cocoon cli.
--
Gregor J. Rothfuss
COO, Wyona Content Management Solutions http://wyona.com
Apache Lenya http://lenya.apache.org
gregor.rothfuss@wyona.com gregor@apache.org