Posted to dev@forrest.apache.org by David Crossley <cr...@indexgeo.com.au> on 2003/09/04 09:04:27 UTC
broken build when href=directory (Was: IHTML sample bug)
Juan Jose Pablos wrote:
> Upayavira wrote:
> >
> > Does the ihtml sample use the HTMLGenerator?
> >
> > Carsten has just fixed a CLI bug in the HTMLGenerator that prevented it
> > working without an HTTPEnvironment present. So if someone is able to
> > upgrade to CVS Cocoon and retest the IHTML sample, it might well work.
>
> I upgraded Cocoon on my system, and that seems to fix this issue.
Great, but I think that this second thing is a different issue.
Hence I changed the Subject line.
> But it changes another behaviour: it expects an index.xml when you
> define:
>
> <samples label="Samples" href="samples/" tab="samples">
>
> X [0] samples/index.html BROKEN:
> /var/tmp/fs/build/tmp/context/content/xdocs/samples/index.xml (No such
> file or directory)
Yes, I noticed this recent change too. One of our projects had
site.xml entries like Juan shows above, ending in a directory slash
with no explicit reference to index.html.
It built fine and generated an index.html in the correct place
using the relevant index.xml source.
Recently that changed and I had to use explicit index.html hrefs.
--David
CLI: Handling links to directories (Re: broken build when href=directory)
Posted by Jeff Turner <je...@apache.org>.
(moving to dev@cocoon)
On Thu, Sep 04, 2003 at 12:24:48PM +0100, Upayavira wrote:
> Juan Jose Pablos wrote:
>
> >David Crossley wrote:
> >
> >>
> >>Yes i noticed this recent change too. One of our projects had
> >>site.xml entries like Juan shows above, ending in a directory slash
> >>with no explicit reference to index.html
> >>
> >
> >So is this the expect behaviour?, when there is a ending slash a
> >welcome.file "index.html" would be used?
> >
> >Cheers,
> >Cheche
>
> Don't entirely understand what you guys are discussing, but the CLI
> should, if it comes across a link ending in a slash, it should append
> 'index.html' to that. You can change the filename that is appended using
> the <default-filename/> node in the cli.xconf, but I believe it will
> default to 'index.html' if one is not specified (the actual default
> value comes from an entry in org.apache.cocoon.Constants).
>
> This behaviour should not have changed at all since the xconf format was
> first created.
Possibly it does append index.html, but it still breaks:
java.lang.NullPointerException
at org.apache.cocoon.environment.AbstractEnvironment.release(AbstractEnvironment.java:521)
at org.apache.cocoon.environment.wrapper.MutableEnvironmentFacade.release(MutableEnvironmentFacade.java:332)
at org.apache.cocoon.generation.FileGenerator.recycle(FileGenerator.java:90)
at org.apache.avalon.excalibur.pool.ResourceLimitingPool.put(ResourceLimitingPool.java:438)
at org.apache.avalon.excalibur.component.PoolableComponentHandler.doPut(PoolableComponentHandler.java:245)
at org.apache.avalon.excalibur.component.ComponentHandler.put(ComponentHandler.java:452)
at org.apache.avalon.excalibur.component.ExcaliburComponentSelector.release(ExcaliburComponentSelector.java:336)
at org.apache.cocoon.components.ExtendedComponentSelector.release(ExtendedComponentSelector.java:326)
at org.apache.cocoon.components.ExtendedComponentSelector.release(ExtendedComponentSelector.java:323)
at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.recycle(AbstractProcessingPipeline.java:641)
at org.apache.cocoon.components.pipeline.impl.BaseCachingProcessingPipeline.recycle(BaseCachingProcessingPipeline.java:112)
at org.apache.cocoon.components.pipeline.impl.AbstractCachingProcessingPipeline.recycle(AbstractCachingProcessingPipeline.java:961)
at org.apache.avalon.excalibur.pool.ResourceLimitingPool.put(ResourceLimitingPool.java:438)
at org.apache.avalon.excalibur.component.PoolableComponentHandler.doPut(PoolableComponentHandler.java:245)
at org.apache.avalon.excalibur.component.ComponentHandler.put(ComponentHandler.java:452)
at org.apache.avalon.excalibur.component.ExcaliburComponentSelector.release(ExcaliburComponentSelector.java:336)
at org.apache.cocoon.components.ExtendedComponentSelector.release(ExtendedComponentSelector.java:326)
at org.apache.cocoon.components.EnvironmentDescription.release(CocoonComponentManager.java:602)
at org.apache.cocoon.components.CocoonComponentManager.endProcessing(CocoonComponentManager.java:212)
at org.apache.cocoon.Cocoon.process(Cocoon.java:659)
at org.apache.cocoon.bean.CocoonWrapper.getPage(CocoonWrapper.java:551)
at org.apache.cocoon.bean.CocoonBean.processTarget(CocoonBean.java:477)
at org.apache.cocoon.bean.CocoonBean.process(CocoonBean.java:294)
at org.apache.cocoon.Main.main(Main.java:392)
....
....
* [0] community/
* [49] forrestbot-intro.html
That is with a <link href="community/"> in index.html. There is a
community/index.xml file.
Whether to append anything to directory links is debatable. It feels
very similar to the question of whether to munge page extensions.
Perhaps we could solve the problem in the same way, by declaring that
links of the form '**/' have a MIME type 'text/directory' [1], and then
having an entry in mime.types to specify a default filename, if any:
  text/directory    index.html
Or perhaps a <mime-types> section in cli.xconf?
Anyway, just random thoughts. The problem is very easy to work around
now we have wildcard exclusions, with <exclude pattern="**/"/>
--Jeff
[1] http://www.faqs.org/rfcs/rfc2425.html
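The MIME-type idea floated above can be sketched roughly as follows. This is only an illustration of the proposal, not code from Cocoon or Forrest: the class DirectoryMimeTypeSketch and its register() and resolve() methods are hypothetical names, and the only rule assumed is that a link ending in a slash is treated as 'text/directory'.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of the proposal; none of these names exist in Cocoon.
final class DirectoryMimeTypeSketch {
    static final String DIRECTORY_TYPE = "text/directory";

    // Plays the role of mime.types entries such as "text/directory index.html".
    private final Map<String, String> defaultFilenames = new HashMap<>();

    void register(String mimeType, String defaultFilename) {
        defaultFilenames.put(mimeType, defaultFilename);
    }

    // Assumption: only links ending in a slash are classified here.
    static Optional<String> mimeTypeOf(String link) {
        return link.endsWith("/") ? Optional.of(DIRECTORY_TYPE) : Optional.empty();
    }

    // Appends the default filename registered for the link's type, if any;
    // otherwise the link is returned unchanged.
    String resolve(String link) {
        return mimeTypeOf(link)
                .map(defaultFilenames::get) // empty when no entry is registered
                .map(name -> link + name)
                .orElse(link);
    }

    public static void main(String[] args) {
        DirectoryMimeTypeSketch types = new DirectoryMimeTypeSketch();
        types.register(DIRECTORY_TYPE, "index.html");
        System.out.println(types.resolve("community/"));
        System.out.println(types.resolve("forrestbot-intro.html"));
    }
}
```

With no 'text/directory' entry registered, resolve() leaves directory links as they are, which corresponds to the "default filename, if any" part of the suggestion.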
> Hope this is relevant.
>
> Upayavira
>
>
Re: broken build when href=directory (Was: IHTML sample bug)
Posted by Upayavira <uv...@upaya.co.uk>.
Jeff Turner wrote:
>On Thu, Sep 04, 2003 at 01:37:21PM +0200, Juan Jose Pablos wrote:
>
>
>>Upayavira wrote:
>>
>>
>>>Don't entirely understand what you guys are discussing, but the CLI
>>>should, if it comes across a link ending in a slash, it should append
>>>'index.html' to that. You can change the filename that is appended using
>>>the <default-filename/> node in the cli.xconf, but I believe it will
>>>default to 'index.html' if one is not specified (the actual default
>>>value comes from an entry in org.apache.cocoon.Constants).
>>>
>>>This behaviour should not have changed at all since the xconf format was
>>>first created.
>>>
>>>
>>The behaviour has changed over the last few weeks, last time it change
>>was when you guys fixed HTMLGenerator.
>>
>>
>
>Are you sure? I experience the problem with the version of Cocoon
>currently in Forrest. I think it existed with the old CLI too. We
>filtered out **/ links in filterlinks.xsl, and now do the same in
>cli.xconf.
>
>
Ah, I'm (a little) relieved. I could not see any recent changes that
could have caused this.
Now off to continue on cocoon-dev.
Upayavira
Re: broken build when href=directory (Was: IHTML sample bug)
Posted by Juan Jose Pablos <ch...@che-che.com>.
Jeff,
>>The behaviour has changed over the last few weeks, last time it change
>>was when you guys fixed HTMLGenerator.
>
>
> Are you sure? I experience the problem with the version of Cocoon
> currently in Forrest. I think it existed with the old CLI too. We
> filtered out **/ links in filterlinks.xsl, and now do the same in
> cli.xconf.
With 20030831 I did not get that error; yesterday I upgraded and got
that "file not found" error.
Cheers,
Cheche
Re: broken build when href=directory (Was: IHTML sample bug)
Posted by Jeff Turner <je...@apache.org>.
On Thu, Sep 04, 2003 at 01:37:21PM +0200, Juan Jose Pablos wrote:
> Upayavira wrote:
> >
> >Don't entirely understand what you guys are discussing, but the CLI
> >should, if it comes across a link ending in a slash, it should append
> >'index.html' to that. You can change the filename that is appended using
> >the <default-filename/> node in the cli.xconf, but I believe it will
> >default to 'index.html' if one is not specified (the actual default
> >value comes from an entry in org.apache.cocoon.Constants).
> >
> >This behaviour should not have changed at all since the xconf format was
> >first created.
>
> The behaviour has changed over the last few weeks, last time it change
> was when you guys fixed HTMLGenerator.
Are you sure? I experience the problem with the version of Cocoon
currently in Forrest. I think it existed with the old CLI too. We
filtered out **/ links in filterlinks.xsl, and now do the same in
cli.xconf.
--Jeff
> I happy with any behaviour as long as there is a consensus.
>
> >
> >Hope this is relevant.
> >
> >Upayavira
> >
>
> It is!
>
> Cheers,
> Cheche
>
Re: broken build when href=directory (Was: IHTML sample bug)
Posted by Juan Jose Pablos <ch...@che-che.com>.
Upayavira wrote:
>
> Don't entirely understand what you guys are discussing, but the CLI
> should, if it comes across a link ending in a slash, it should append
> 'index.html' to that. You can change the filename that is appended using
> the <default-filename/> node in the cli.xconf, but I believe it will
> default to 'index.html' if one is not specified (the actual default
> value comes from an entry in org.apache.cocoon.Constants).
>
> This behaviour should not have changed at all since the xconf format was
> first created.
The behaviour has changed over the last few weeks; the last time it
changed was when you guys fixed HTMLGenerator.
I am happy with any behaviour as long as there is a consensus.
>
> Hope this is relevant.
>
> Upayavira
>
It is!
Cheers,
Cheche
Re: broken build when href=directory (Was: IHTML sample bug)
Posted by Upayavira <uv...@upaya.co.uk>.
Juan Jose Pablos wrote:
> David Crossley wrote:
>
>>
>> Yes i noticed this recent change too. One of our projects had
>> site.xml entries like Juan shows above, ending in a directory slash
>> with no explicit reference to index.html
>>
>
> So is this the expect behaviour?, when there is a ending slash a
> welcome.file "index.html" would be used?
>
> Cheers,
> Cheche
I don't entirely understand what you guys are discussing, but if the CLI
comes across a link ending in a slash, it should append 'index.html' to
it. You can change the filename that is appended using
the <default-filename/> node in the cli.xconf, but I believe it will
default to 'index.html' if one is not specified (the actual default
value comes from an entry in org.apache.cocoon.Constants).
This behaviour should not have changed at all since the xconf format was
first created.
Hope this is relevant.
Upayavira
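As a rough illustration of the behaviour described in the message above, the destination filename could be derived along these lines. This is a minimal sketch, not the CLI's actual implementation: DefaultFilenameSketch and destinationFor() are hypothetical names, and FALLBACK_DEFAULT merely stands in for the value said to come from org.apache.cocoon.Constants.

```java
// Hypothetical helper; it does not reproduce Cocoon's real code.
final class DefaultFilenameSketch {
    // Assumed fallback when cli.xconf has no <default-filename/> entry.
    static final String FALLBACK_DEFAULT = "index.html";

    // Returns the relative filename under which content fetched from the
    // given URI would be saved. The link inside the page is not rewritten.
    static String destinationFor(String uri, String configuredDefault) {
        String defaultName = (configuredDefault == null || configuredDefault.isEmpty())
                ? FALLBACK_DEFAULT
                : configuredDefault;
        // An empty URI (site root) or one ending in a slash names a directory.
        if (uri.isEmpty() || uri.endsWith("/")) {
            return uri + defaultName;
        }
        return uri;
    }

    public static void main(String[] args) {
        System.out.println(destinationFor("samples/", null));
        System.out.println(destinationFor("samples/", "welcome.html"));
        System.out.println(destinationFor("forrestbot-intro.html", null));
    }
}
```

Only the name of the saved file is affected here; whether the pipeline behind samples/ can actually be resolved is a separate question, which is where the build reported in this thread breaks.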
Re: broken build when href=directory (Was: IHTML sample bug)
Posted by Jeff Turner <je...@apache.org>.
On Thu, Sep 04, 2003 at 12:51:22PM +0100, Upayavira wrote:
> Jeff Turner wrote:
> >On Thu, Sep 04, 2003 at 01:22:54PM +0200, Juan Jose Pablos wrote:
> >>David Crossley wrote:
> >>>Yes i noticed this recent change too. One of our projects had
> >>>site.xml entries like Juan shows above, ending in a directory slash
> >>>with no explicit reference to index.html
> >>>
> >>So is this the expect behaviour?, when there is a ending slash a
> >>welcome.file "index.html" would be used?
> >
> >IMHO it's not up to Forrest. We just emit the link as-is.
> >
> Yes. If a user puts a URL into a page ending in a slash, that stays as
> is. But with just a slash, how do you deal with crawling? You can get
> the content from blah/, but what filename do you use to save that
> content? The behaviour I'm referring to changes the filename to
> blah/index.html when saving it. But it doesn't do any rewriting of the
> filenames or anything like that.
Oh I see. I hadn't thought of link rewriting and link following as
separate operations.
> >I've added an <exclude pattern="**/"/> to the default cli.xconf to fix
> >the problem in Forrest.
> >
> >
> That's unfortunate really as it could confuse some people as to why huge
> sections of their site are excluded.
True, although it's always been like this in Forrest and AFAIR no-one has
complained. Thanks for the info!
--Jeff
> (But until bug is fixed, unfortunately necessary).
>
> Regards, Upayavira
>
Re: broken build when href=directory (Was: IHTML sample bug)
Posted by Upayavira <uv...@upaya.co.uk>.
Jeff Turner wrote:
>On Thu, Sep 04, 2003 at 01:22:54PM +0200, Juan Jose Pablos wrote:
>
>
>>David Crossley wrote:
>>
>>
>>>Yes i noticed this recent change too. One of our projects had
>>>site.xml entries like Juan shows above, ending in a directory slash
>>>with no explicit reference to index.html
>>>
>>>
>>>
>>So is this the expect behaviour?, when there is a ending slash a
>>welcome.file "index.html" would be used?
>>
>>
>
>IMHO it's not up to Forrest. We just emit the link as-is.
>
Yes. If a user puts a URL into a page ending in a slash, that stays as
is. But with just a slash, how do you deal with crawling? You can get
the content from blah/, but what filename do you use to save that
content? The behaviour I'm referring to changes the filename to
blah/index.html when saving it. But it doesn't do any rewriting of the
filenames or anything like that.
>I've added an <exclude pattern="**/"/> to the default cli.xconf to fix
>the problem in Forrest.
>
>
That's unfortunate really as it could confuse some people as to why huge
sections of their site are excluded.
(But until the bug is fixed, it is unfortunately necessary.)
Regards, Upayavira
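To make the effect of the exclusion discussed above concrete, here is a small sketch of how a wildcard exclude could filter the links a crawler follows. It is a simplification under stated assumptions, not Cocoon's own wildcard matcher: ExcludeFilterSketch, toRegex() and linksToCrawl() are hypothetical names, '**' is assumed to match any characters including slashes, and a single '*' any characters except a slash.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical, simplified wildcard filter; Cocoon's own matcher may differ.
final class ExcludeFilterSketch {
    // Translates a wildcard pattern into a regular expression.
    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        int i = 0;
        while (i < wildcard.length()) {
            char c = wildcard.charAt(i);
            if (c == '*') {
                boolean doubled = i + 1 < wildcard.length() && wildcard.charAt(i + 1) == '*';
                regex.append(doubled ? ".*" : "[^/]*");
                i += doubled ? 2 : 1;
            } else {
                // Every other character is matched literally.
                regex.append(Pattern.quote(String.valueOf(c)));
                i++;
            }
        }
        return Pattern.compile(regex.toString());
    }

    // Returns the links that remain after dropping those matching the pattern.
    static List<String> linksToCrawl(List<String> links, String excludePattern) {
        Pattern exclude = toRegex(excludePattern);
        List<String> kept = new ArrayList<>();
        for (String link : links) {
            if (!exclude.matcher(link).matches()) {
                kept.add(link);
            }
        }
        return kept;
    }
}
```

Under these assumptions, a pattern that matches every link ending in a slash removes community/ and samples/ from the crawl entirely, so pages reachable only through such links are never generated. That is the confusion the message above warns about.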
Re: broken build when href=directory (Was: IHTML sample bug)
Posted by Jeff Turner <je...@apache.org>.
On Thu, Sep 04, 2003 at 01:22:54PM +0200, Juan Jose Pablos wrote:
> David Crossley wrote:
> >
> >Yes i noticed this recent change too. One of our projects had
> >site.xml entries like Juan shows above, ending in a directory slash
> >with no explicit reference to index.html
> >
>
> So is this the expect behaviour?, when there is a ending slash a
> welcome.file "index.html" would be used?
IMHO it's not up to Forrest. We just emit the link as-is.
I've added an <exclude pattern="**/"/> to the default cli.xconf to fix
the problem in Forrest.
--Jeff
> Cheers,
> Cheche
>
Re: broken build when href=directory (Was: IHTML sample bug)
Posted by Juan Jose Pablos <ch...@che-che.com>.
David Crossley wrote:
>
> Yes i noticed this recent change too. One of our projects had
> site.xml entries like Juan shows above, ending in a directory slash
> with no explicit reference to index.html
>
So is this the expected behaviour? When there is an ending slash, would
a welcome file "index.html" be used?
Cheers,
Cheche