Posted to dev@forrest.apache.org by David Crossley <cr...@indexgeo.com.au> on 2003/09/04 09:04:27 UTC

broken build when href=directory (Was: IHTML sample bug)

Juan Jose Pablos wrote:
> Upayavira wrote:
> > 
> > Does the ihtml sample use the HTMLGenerator?
> > 
> > Carsten has just fixed a CLI bug in the HTMLGenerator that prevented it 
> > working without an HTTPEnvironment present. So if someone is able to 
> > upgrade to CVS Cocoon and retest the IHTML sample, it might well work.
> 
> I upgrade cocoon on my system, and that seems to fix this issue.

Great, but i think that this second thing is a different issue.
Hence i changed the Subject line.

> But it 
> changes another behaviour: it expects an index.xml when you define:
> 
> <samples label="Samples" href="samples/" tab="samples">
> 
> X [0] samples/index.html	BROKEN: 
> /var/tmp/fs/build/tmp/context/content/xdocs/samples/index.xml (No such 
> file or directory)

Yes i noticed this recent change too. One of our projects had
site.xml entries like Juan shows above, ending in a directory slash
with no explicit reference to index.html

It built fine and generated an index.html in the correct place
using the relevant index.xml source.

Recently that changed and i had to use explicit index.html hrefs.

--David



CLI: Handling links to directories (Re: broken build when href=directory)

Posted by Jeff Turner <je...@apache.org>.
(moving to dev@cocoon)

On Thu, Sep 04, 2003 at 12:24:48PM +0100, Upayavira wrote:
> Juan Jose Pablos wrote:
> 
> >David Crossley wrote:
> >
> >>
> >>Yes i noticed this recent change too. One of our projects had
> >>site.xml entries like Juan shows above, ending in a directory slash
> >>with no explicit reference to index.html
> >>
> >
> >So is this the expect behaviour?, when there is a ending slash a 
> >welcome.file "index.html" would be used?
> >
> >Cheers,
> >Cheche
> 
> Don't entirely understand what you guys are discussing, but the CLI 
> should, if it comes across a link ending in a slash, it should append 
> 'index.html' to that. You can change the filename that is appended using 
> the <default-filename/> node in the cli.xconf, but I believe it will 
> default to 'index.html' if one is not specified (the actual default 
> value comes from an entry in org.apache.cocoon.Constants).
> 
> This behaviour should not have changed at all since the xconf format was 
> first created.

Possibly it does append index.html, but it still breaks:

java.lang.NullPointerException
        at org.apache.cocoon.environment.AbstractEnvironment.release(AbstractEnvironment.java:521)
        at org.apache.cocoon.environment.wrapper.MutableEnvironmentFacade.release(MutableEnvironmentFacade.java:332)
        at org.apache.cocoon.generation.FileGenerator.recycle(FileGenerator.java:90)
        at org.apache.avalon.excalibur.pool.ResourceLimitingPool.put(ResourceLimitingPool.java:438)
        at org.apache.avalon.excalibur.component.PoolableComponentHandler.doPut(PoolableComponentHandler.java:245)
        at org.apache.avalon.excalibur.component.ComponentHandler.put(ComponentHandler.java:452)
        at org.apache.avalon.excalibur.component.ExcaliburComponentSelector.release(ExcaliburComponentSelector.java:336)
        at org.apache.cocoon.components.ExtendedComponentSelector.release(ExtendedComponentSelector.java:326)
        at org.apache.cocoon.components.ExtendedComponentSelector.release(ExtendedComponentSelector.java:323)
        at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.recycle(AbstractProcessingPipeline.java:641)
        at org.apache.cocoon.components.pipeline.impl.BaseCachingProcessingPipeline.recycle(BaseCachingProcessingPipeline.java:112)
        at org.apache.cocoon.components.pipeline.impl.AbstractCachingProcessingPipeline.recycle(AbstractCachingProcessingPipeline.java:961)
        at org.apache.avalon.excalibur.pool.ResourceLimitingPool.put(ResourceLimitingPool.java:438)
        at org.apache.avalon.excalibur.component.PoolableComponentHandler.doPut(PoolableComponentHandler.java:245)
        at org.apache.avalon.excalibur.component.ComponentHandler.put(ComponentHandler.java:452)
        at org.apache.avalon.excalibur.component.ExcaliburComponentSelector.release(ExcaliburComponentSelector.java:336)
        at org.apache.cocoon.components.ExtendedComponentSelector.release(ExtendedComponentSelector.java:326)
        at org.apache.cocoon.components.EnvironmentDescription.release(CocoonComponentManager.java:602)
        at org.apache.cocoon.components.CocoonComponentManager.endProcessing(CocoonComponentManager.java:212)
        at org.apache.cocoon.Cocoon.process(Cocoon.java:659)
        at org.apache.cocoon.bean.CocoonWrapper.getPage(CocoonWrapper.java:551)
        at org.apache.cocoon.bean.CocoonBean.processTarget(CocoonBean.java:477)
        at org.apache.cocoon.bean.CocoonBean.process(CocoonBean.java:294)
        at org.apache.cocoon.Main.main(Main.java:392)
....
....
* [0] community/
* [49] forrestbot-intro.html

That is with a <link href="community/"> in index.html.  There is a
community/index.xml file.

Whether to append anything to directory links is debatable.  It feels
very similar to the question of whether to munge page extensions.
Perhaps we could solve the problem in the same way, by declaring that
links of the form '**/' have a MIME type 'text/directory' [1], and then
having an entry in mime.types to specify a default filename, if any:

text/directory  index.html

Or perhaps a <mime-types> section in cli.xconf?
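
A rough sketch of the lookup proposed above, written in Python rather than
against the real Cocoon CLI. The function names and the idea of parsing a
"type filename" line are assumptions made for illustration only; nothing
here is an existing Cocoon API.

```python
def parse_mime_defaults(text):
    """Parse lines such as 'text/directory  index.html' into a dict.

    Hypothetical format: MIME type, whitespace, default filename.
    """
    defaults = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        fields = line.split()
        if len(fields) >= 2:
            defaults[fields[0]] = fields[1]
    return defaults


def mime_type_for(link):
    """Treat any link ending in '/' as 'text/directory' (the proposal)."""
    return "text/directory" if link.endswith("/") else None


def resolve_target(link, defaults):
    """Append the configured default filename, if there is one."""
    mime = mime_type_for(link)
    name = defaults.get(mime) if mime else None
    return link + name if name else link


defaults = parse_mime_defaults("text/directory  index.html\n")
print(resolve_target("community/", defaults))
print(resolve_target("community/", {}))  # no entry: link is left alone
```

With no entry for text/directory the link is returned untouched, which
matches the "default filename, if any" part of the idea.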

Anyway, just random thoughts.  The problem is very easy to work around
now we have wildcard exclusions, with <exclude pattern="**/"/>
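
To show what an exclusion such as "**/" would filter out, here is a
simplified matcher in Python. It assumes "**" matches any run of characters
including slashes and "*" matches within one path segment; the real Cocoon
wildcard matcher may differ in details, and wildcard_to_regex and
is_excluded are hypothetical names.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a wildcard pattern into a compiled regex (simplified)."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")      # assumed: '**' crosses directory slashes
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")   # assumed: '*' stays within one segment
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def is_excluded(link, exclude_patterns):
    """True if the whole link matches any of the exclude patterns."""
    return any(wildcard_to_regex(p).fullmatch(link) for p in exclude_patterns)


links = ["community/", "forrestbot-intro.html", "samples/", "samples/index.html"]
to_crawl = [link for link in links if not is_excluded(link, ["**/"])]
print(to_crawl)
```

Under these assumptions every link ending in a slash is skipped, while
explicit hrefs such as samples/index.html are still crawled.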


--Jeff

[1] http://www.faqs.org/rfcs/rfc2425.html

> Hope this is relevant.
> 
> Upayavira
> 
> 

Re: broken build when href=directory (Was: IHTML sample bug)

Posted by Upayavira <uv...@upaya.co.uk>.
Jeff Turner wrote:

>On Thu, Sep 04, 2003 at 01:37:21PM +0200, Juan Jose Pablos wrote:
>  
>
>>Upayavira wrote:
>>    
>>
>>>Don't entirely understand what you guys are discussing, but the CLI 
>>>should, if it comes across a link ending in a slash, it should append 
>>>'index.html' to that. You can change the filename that is appended using 
>>>the <default-filename/> node in the cli.xconf, but I believe it will 
>>>default to 'index.html' if one is not specified (the actual default 
>>>value comes from an entry in org.apache.cocoon.Constants).
>>>
>>>This behaviour should not have changed at all since the xconf format was 
>>>first created.
>>>      
>>>
>>The behaviour has changed over the last few weeks, last time it change 
>>was when you guys fixed HTMLGenerator.
>>    
>>
>
>Are you sure?  I experience the problem with the version of Cocoon
>currently in Forrest.  I think it existed with the old CLI too.  We
>filtered out **/ links in filterlinks.xsl, and now do the same in
>cli.xconf.
>  
>
Ah, I'm (a little) relieved. I could not see any recent changes that 
could have caused this.

Now off to continue on cocoon-dev.

Upayavira



Re: broken build when href=directory (Was: IHTML sample bug)

Posted by Juan Jose Pablos <ch...@che-che.com>.
Jeff,


>>The behaviour has changed over the last few weeks, last time it change 
>>was when you guys fixed HTMLGenerator.
> 
> 
> Are you sure?  I experience the problem with the version of Cocoon
> currently in Forrest.  I think it existed with the old CLI too.  We
> filtered out **/ links in filterlinks.xsl, and now do the same in
> cli.xconf.

With 20030831 I did not get that error; yesterday I upgraded and got 
that "file not found" error.

Cheers,
Cheche





Re: broken build when href=directory (Was: IHTML sample bug)

Posted by Jeff Turner <je...@apache.org>.
On Thu, Sep 04, 2003 at 01:37:21PM +0200, Juan Jose Pablos wrote:
> Upayavira wrote:
> >
> >Don't entirely understand what you guys are discussing, but the CLI 
> >should, if it comes across a link ending in a slash, it should append 
> >'index.html' to that. You can change the filename that is appended using 
> >the <default-filename/> node in the cli.xconf, but I believe it will 
> >default to 'index.html' if one is not specified (the actual default 
> >value comes from an entry in org.apache.cocoon.Constants).
> >
> >This behaviour should not have changed at all since the xconf format was 
> >first created.
> 
> The behaviour has changed over the last few weeks, last time it change 
> was when you guys fixed HTMLGenerator.

Are you sure?  I experience the problem with the version of Cocoon
currently in Forrest.  I think it existed with the old CLI too.  We
filtered out **/ links in filterlinks.xsl, and now do the same in
cli.xconf.

--Jeff

> I happy with any behaviour as long as there is a consensus.
> 
> >
> >Hope this is relevant.
> >
> >Upayavira
> >
> 
> It is!
> 
> Cheers,
> Cheche
> 

Re: broken build when href=directory (Was: IHTML sample bug)

Posted by Juan Jose Pablos <ch...@che-che.com>.
Upayavira wrote:
> 
> Don't entirely understand what you guys are discussing, but the CLI 
> should, if it comes across a link ending in a slash, it should append 
> 'index.html' to that. You can change the filename that is appended using 
> the <default-filename/> node in the cli.xconf, but I believe it will 
> default to 'index.html' if one is not specified (the actual default 
> value comes from an entry in org.apache.cocoon.Constants).
> 
> This behaviour should not have changed at all since the xconf format was 
> first created.

The behaviour has changed over the last few weeks; the last time it changed 
was when you guys fixed HTMLGenerator.

I am happy with any behaviour as long as there is a consensus.

> 
> Hope this is relevant.
> 
> Upayavira
> 

It is!

Cheers,
Cheche


Re: broken build when href=directory (Was: IHTML sample bug)

Posted by Upayavira <uv...@upaya.co.uk>.
Juan Jose Pablos wrote:

> David Crossley wrote:
>
>>
>> Yes i noticed this recent change too. One of our projects had
>> site.xml entries like Juan shows above, ending in a directory slash
>> with no explicit reference to index.html
>>
>
> So is this the expect behaviour?, when there is a ending slash a 
> welcome.file "index.html" would be used?
>
> Cheers,
> Cheche

I don't entirely understand what you guys are discussing, but if the CLI 
comes across a link ending in a slash, it should append 
'index.html' to that. You can change the filename that is appended using 
the <default-filename/> node in the cli.xconf, but I believe it will 
default to 'index.html' if one is not specified (the actual default 
value comes from an entry in org.apache.cocoon.Constants).

This behaviour should not have changed at all since the xconf format was 
first created.

Hope this is relevant.

Upayavira



Re: broken build when href=directory (Was: IHTML sample bug)

Posted by Jeff Turner <je...@apache.org>.
On Thu, Sep 04, 2003 at 12:51:22PM +0100, Upayavira wrote:
> Jeff Turner wrote:
> >On Thu, Sep 04, 2003 at 01:22:54PM +0200, Juan Jose Pablos wrote:
> >>David Crossley wrote:
> >>>Yes i noticed this recent change too. One of our projects had
> >>>site.xml entries like Juan shows above, ending in a directory slash
> >>>with no explicit reference to index.html
> >>>
> >>So is this the expect behaviour?, when there is a ending slash a 
> >>welcome.file "index.html" would be used?
> >
> >IMHO it's not up to Forrest.  We just emit the link as-is.
> >
> Yes. If a user puts a URL into a page ending in a slash, that stays as 
> is. But with just a slash, how do you deal with crawling? You can get 
> the content from blah/, but what filename do you use to save that 
> content? The behaviour I'm referring to changes the filename to 
> blah/index.html when saving it. But it doesn't do any rewriting of the 
> filenames or anything like that.

Oh I see.  I hadn't thought of link rewriting and link following as
separate operations.

> >I've added an <exclude pattern="**/"/> to the default cli.xconf to fix
> >the problem in Forrest.
> > 
> >
> That's unfortunate really as it could confuse some people as to why huge 
> sections of their site are excluded.

True, although it's always been like this in Forrest and AFAIR no-one has
complained.  Thanks for the info!

--Jeff

> (But until bug is fixed, unfortunately necessary).
> 
> Regards, Upayavira
> 

Re: broken build when href=directory (Was: IHTML sample bug)

Posted by Upayavira <uv...@upaya.co.uk>.
Jeff Turner wrote:

>On Thu, Sep 04, 2003 at 01:22:54PM +0200, Juan Jose Pablos wrote:
>  
>
>>David Crossley wrote:
>>    
>>
>>>Yes i noticed this recent change too. One of our projects had
>>>site.xml entries like Juan shows above, ending in a directory slash
>>>with no explicit reference to index.html
>>>
>>>      
>>>
>>So is this the expect behaviour?, when there is a ending slash a 
>>welcome.file "index.html" would be used?
>>    
>>
>
>IMHO it's not up to Forrest.  We just emit the link as-is.
>
Yes. If a user puts a URL into a page ending in a slash, that stays as 
is. But with just a slash, how do you deal with crawling? You can get 
the content from blah/, but what filename do you use to save that 
content? The behaviour I'm referring to changes the filename to 
blah/index.html when saving it. But it doesn't do any rewriting of the 
filenames or anything like that.
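
The distinction can be sketched in a few lines of Python. This is only an
illustration of the behaviour described above, not the CLI's actual code:
save_path, crawl and the "build/site" directory are made-up names, and
"index.html" stands in for whatever <default-filename/> is set to.

```python
import posixpath

DEFAULT_FILENAME = "index.html"  # assumed default, as described above


def save_path(link, default_filename=DEFAULT_FILENAME):
    """Filename under which the content fetched from `link` is stored."""
    if link.endswith("/"):
        return link + default_filename
    return link


def crawl(links, dest_dir, default_filename=DEFAULT_FILENAME):
    """Map each link, unchanged, to the file its content is saved in."""
    return {
        link: posixpath.join(dest_dir, save_path(link, default_filename))
        for link in links
    }


for link, target in crawl(["blah/", "page.html"], "build/site").items():
    print(link, "->", target)
```

The keys of the returned mapping are the links exactly as they appear in
the page; only the save location gains the default filename.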

>I've added an <exclude pattern="**/"/> to the default cli.xconf to fix
>the problem in Forrest.
>  
>
That's unfortunate really as it could confuse some people as to why huge 
sections of their site are excluded.

(But until bug is fixed, unfortunately necessary).

Regards, Upayavira


Re: broken build when href=directory (Was: IHTML sample bug)

Posted by Jeff Turner <je...@apache.org>.
On Thu, Sep 04, 2003 at 01:22:54PM +0200, Juan Jose Pablos wrote:
> David Crossley wrote:
> >
> >Yes i noticed this recent change too. One of our projects had
> >site.xml entries like Juan shows above, ending in a directory slash
> >with no explicit reference to index.html
> >
> 
> So is this the expect behaviour?, when there is a ending slash a 
> welcome.file "index.html" would be used?

IMHO it's not up to Forrest.  We just emit the link as-is.

I've added an <exclude pattern="**/"/> to the default cli.xconf to fix
the problem in Forrest.

--Jeff

> Cheers,
> Cheche
> 

Re: broken build when href=directory (Was: IHTML sample bug)

Posted by Juan Jose Pablos <ch...@che-che.com>.
David Crossley wrote:
> 
> Yes i noticed this recent change too. One of our projects had
> site.xml entries like Juan shows above, ending in a directory slash
> with no explicit reference to index.html
> 

So is this the expected behaviour? When there is an ending slash, a 
welcome.file "index.html" would be used?

Cheers,
Cheche