Posted to dev@forrest.apache.org by David Crossley <cr...@apache.org> on 2007/04/05 10:05:05 UTC

[headsup] recent change in broken link handling

The handling of broken site: links has recently changed.

It used to report the link as BROKEN in the site build
and write its details to the broken-links.xml file.

Now it generates a file called error_site_$linkname
with the "No pipeline matched request:" info inside it.

I wonder if it is related to the recent change in
error handling for the locationmap.

-David

> Author: crossley
> Date: Thu Apr  5 00:40:14 2007
> New Revision: 525747
> 
> URL: http://svn.apache.org/viewvc?view=rev&rev=525747
> Log:
> Automatic publish from forrestbot
> 
> Added:
>     forrest/site/docs_0_80/error_site_your-project/
>     forrest/site/docs_0_80/error_site_your-project/catalog
...
...

Re: [headsup] recent change in broken link handling

Posted by David Crossley <cr...@apache.org>.
Ross Gardler wrote:
> Thorsten Scherler wrote:
> >
> ><map:when test="resourceNotFound">
> >
> >That is the puppy. I ran into the broken link issue the other day and
> >was surprised that my build was successful with 0 pages built.
> >
> >Actually the resourceNotFound is the only exception we do not want to
> >catch (see broken links). 
> >
> >Meaning, as a quick fix to lose neither the whole error handling nor the
> >broken links, we need to catch all exceptions *except* resourceNotFound
> >and change the resource-not-found exception in the lm to a specific
> >locationMap exception.
> >
> >Then we get back the broken links and have a "nicer" error handling.
> >
> >wdyt?
> 
> Unfortunately, it's not that simple. We can't spot when the locationmap 
> returns null to the sitemap hence we can't trap this and throw a 
> different exception from the locationmap. At least not with the current 
> implementation.
> 
> The following is from memory of how Tim explained things - can't find 
> the mails in the archive.
> 
> The locationmap returns a null if no entry is found. But the way the 
> locationmap mounting works we have no way of knowing if there is another 
> map to check after the current one. Hence we don't know when a null from 
> a single locationmap means a null from all locationmaps.

Perhaps, at the beginning of the next stage of processing,
something could test whether the reference is still unresolved.

> Now, I'm sure this can be worked around. But it is a bank holiday here 
> in the UK so don't expect me to find loads of time to address this, it's 
> a family weekend. I post this for information in case someone else has time.

Yeah this is a bit of a worry for the Release Plan.
I was a little concerned about the Easter holidays overlap.
Nobody expressed concern during the vote.

Hopefully for some others it is a chance to sneak off
to contribute a bit.

On the other hand we should have people's attention
next week and weekend.

-David

Re: [headsup] recent change in broken link handling

Posted by Ross Gardler <rg...@apache.org>.
Thorsten Scherler wrote:
> On Fri, 2007-04-06 at 21:51 +1000, David Crossley wrote:
>> Ross Gardler wrote:
>>> David Crossley wrote:
>>>> The handling of broken site: links has recently changed.
>>>>
>>>> It used to report the link as BROKEN in the site build
>>>> and write its details to the broken-links.xml file.
>>>>
>>>> Now it generates a file called error_site_$linkname
>>>> with the "No pipeline matched request:" info inside it.
>>>>
>>>> I wonder if it is related to the recent change in
>>>> error handling for the locationmap.
>>> I think it probably is a side effect of the error handling. And not (in 
>>> its current form) a good one.
>> Note for the archives: See FOR-701 and r521063.
> 
> Thanks for this info David!
> ...
> <map:when test="resourceNotFound">
> 
> That is the puppy. I ran into the broken link issue the other day and
> was surprised that my build was successful with 0 pages built.
> 
> Actually the resourceNotFound is the only exception we do not want to
> catch (see broken links). 
> 
> Meaning, as a quick fix to lose neither the whole error handling nor the
> broken links, we need to catch all exceptions *except* resourceNotFound
> and change the resource-not-found exception in the lm to a specific
> locationMap exception.
> 
> Then we get back the broken links and have a "nicer" error handling.
> 
> wdyt?

Unfortunately, it's not that simple. We can't spot when the locationmap 
returns null to the sitemap hence we can't trap this and throw a 
different exception from the locationmap. At least not with the current 
implementation.

The following is from memory of how Tim explained things - can't find 
the mails in the archive.

The locationmap returns a null if no entry is found. But the way the 
locationmap mounting works we have no way of knowing if there is another 
map to check after the current one. Hence we don't know when a null from 
a single locationmap means a null from all locationmaps.
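
The mounting problem described above can be modelled with a small sketch.
This is an illustration in Python, not Forrest's Java implementation: the
class LocationMap, the functions resolve and resolve_or_raise, and the
exception LocationNotFound are all hypothetical names. It only shows why a
single map returning null cannot decide that the hint is missing everywhere,
while the caller that walks every mounted map can.

```python
class LocationNotFound(Exception):
    """Hypothetical exception: no locationmap, mounted or not, had an entry."""


class LocationMap:
    """Toy stand-in for one locationmap: local entries plus mounted maps."""

    def __init__(self, entries, mounted=()):
        self.entries = dict(entries)
        self.mounted = list(mounted)


def resolve(lm, hint):
    """Return the location for hint, or None if this map and its mounts lack it.

    A None here only means "not found in this subtree"; a sibling map
    mounted elsewhere may still hold the entry.
    """
    found = lm.entries.get(hint)
    if found is not None:
        return found
    for child in lm.mounted:
        found = resolve(child, hint)
        if found is not None:
            return found
    return None


def resolve_or_raise(root, hint):
    """Only the caller holding the root map can turn None into a real error."""
    found = resolve(root, hint)
    if found is None:
        raise LocationNotFound(hint)
    return found
```

Under these assumptions, the distinct exception has to be raised at the
point where the whole chain has been consulted, not inside any one map.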

Now, I'm sure this can be worked around. But it is a bank holiday here 
in the UK so don't expect me to find loads of time to address this, it's 
a family weekend. I post this for information in case someone else has time.

Ross

Re: [headsup] recent change in broken link handling

Posted by Thorsten Scherler <th...@apache.org>.
On Fri, 2007-04-06 at 21:51 +1000, David Crossley wrote:
> Ross Gardler wrote:
> > David Crossley wrote:
> > >The handling of broken site: links has recently changed.
> > >
> > >It used to report the link as BROKEN in the site build
> > >and write its details to the broken-links.xml file.
> > >
> > >Now it generates a file called error_site_$linkname
> > >with the "No pipeline matched request:" info inside it.
> > >
> > >I wonder if it is related to the recent change in
> > >error handling for the locationmap.
> > 
> > I think it probably is a side effect of the error handling. And not (in 
> > its current form) a good one.
> 
> Note for the archives: See FOR-701 and r521063.

Thanks for this info David!
...
<map:when test="resourceNotFound">

That is the puppy. I ran into the broken link issue the other day and
was surprised that my build was successful with 0 pages built.

Actually the resourceNotFound is the only exception we do not want to
catch (see broken links). 

Meaning, as a quick fix to lose neither the whole error handling nor the
broken links, we need to catch all exceptions *except* resourceNotFound
and change the resource-not-found exception in the lm to a specific
locationMap exception.

Then we get back the broken links and have a "nicer" error handling.
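
As a rough sketch of the "catch everything except resourceNotFound" idea,
here is a toy model in Python. It is not Cocoon sitemap code and not the
Forrest implementation: ResourceNotFound, render_with_error_handling and
crawl are hypothetical names, and the render callable stands in for
whatever produces a page. It assumes the locationmap can raise its own
distinct exception type, which is the part that still needs to be solved.

```python
class ResourceNotFound(Exception):
    """Hypothetical stand-in for the error behind a broken link."""


def render_with_error_handling(render, uri):
    """Render uri, turning every error except ResourceNotFound into a page."""
    try:
        return render(uri)
    except ResourceNotFound:
        # Do not trap this one: the crawler must see it to report the link.
        raise
    except Exception as exc:
        # Any other failure becomes a readable error page.
        return "Error while processing %s: %s" % (uri, exc)


def crawl(render, uris):
    """Return (pages, broken): rendered output by uri and the broken uris."""
    pages, broken = {}, []
    for uri in uris:
        try:
            pages[uri] = render_with_error_handling(render, uri)
        except ResourceNotFound:
            broken.append(uri)
    return pages, broken
```

With this shape the broken list is populated again, while other failures
still produce the "nicer" error output.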

wdyt?

salu2
-- 
Thorsten Scherler                                 thorsten.at.apache.org
Open Source Java                      consulting, training and solutions


Re: [headsup] recent change in broken link handling

Posted by Ferdinand Soethe <fe...@apache.org>.
For my projects the broken-links-file has often been important in
identifying the problem. I'd be sad to see it become useless.

Ferdinand

Re: [headsup] recent change in broken link handling

Posted by David Crossley <cr...@apache.org>.
Ross Gardler wrote:
> David Crossley wrote:
> >Ross Gardler wrote:
> 
> ...
> 
> >>1) revert the error handling which will give us meaningless locationmap 
> >>errors, but will give a meaningful broken-links.xml file
> >>
> >>2) Improve the generated error page to be more user friendly and show 
> >>the error in a graceful way, this will make the user experience better 
> >>if a broken site is deployed, but will make the admin's job harder
> >>
> >>3) A mix of 1) and 2) in which we generate a nice user error page and we 
> >>use the SourceWritingTransformer to provide meaningful output for the 
> >>admin.
> >>
> >>4) ?
> >>
> >>Thoughts?
> >
> >Dunno.
> >
> >Better locationmap error messages are more important,
> >because that can be complex to debug.
> 
> As others have noted, I personally feel that broken-links.xml is more 
> important. The locationmap can be debugged if you know what you are 
> doing. So this will become a user support issue until we can patch the 
> locationmap (see my reply to Thorsten).

It frightens me that in such a small community
we are creating a user support issue.

The FAQ/Tutorial suggested below will be very important.

> >Perhaps a way out for now is to describe the issue
> >in Jira, add a note to the upgrading_08.html doc, and
> >add to the end of the "site" ant target to scan the
> >build directory to list any error_site_* files.
> >
> >That is not as good as the previous broken link handling
> >because it doesn't list the referring pages.
> 
> Which, for me is very important.

Definitely. We would need to find another way to do it.

> I've just commented out the relevant match in our root sitemap to 
> restore the old behaviour before the code freeze. I think we ought to 
> document the bad locationmap error message in the upgrading doc, along 
> with a hint on debugging via the locationmap logs (set the log level to 
> debug in logkit.xml), and a pointer to the mail lists for assistance. 
> (for now I've put a one line warning, better than nothing but needs 
> improving since reading the locationmap logs is not easy - I'm going to 
> bed now, sorry)

Would someone who knows the locationmap please
add a decent FAQ, a note to upgrading_08.html,
and a Jira Issue?

> If someone manages to change the way the locationmap works and get it to 
> throw a proper error before the code freeze then we can reinstate the 
> error trapping. Otherwise we'll do it in the next release (which will 
> not be so long in the making - we can discuss that after the release).

There is a bit under three days to go.

-David

Re: [headsup] recent change in broken link handling

Posted by Ross Gardler <rg...@apache.org>.
David Crossley wrote:
> Ross Gardler wrote:

...

>> 1) revert the error handling which will give us meaningless locationmap 
>> errors, but will give a meaningful broken-links.xml file
>>
>> 2) Improve the generated error page to be more user friendly and show 
>> the error in a graceful way, this will make the user experience better 
>> if a broken site is deployed, but will make the admin's job harder
>>
>> 3) A mix of 1) and 2) in which we generate a nice user error page and we 
>> use the SourceWritingTransformer to provide meaningful output for the admin.
>>
>> 4) ?
>>
>> Thoughts?
> 
> Dunno.
> 
> Better locationmap error messages are more important,
> because that can be complex to debug.

As others have noted, I personally feel that broken-links.xml is more 
important. The locationmap can be debugged if you know what you are 
doing. So this will become a user support issue until we can patch the 
locationmap (see my reply to Thorsten).

> Perhaps a way out for now is to describe the issue
> in Jira, add a note to the upgrading_08.html doc, and
> add to the end of the "site" ant target to scan the
> build directory to list any error_site_* files.
>
> That is not as good as the previous broken link handling
> because it doesn't list the referring pages.

Which, for me, is very important.

I've just commented out the relevant match in our root sitemap to 
restore the old behaviour before the code freeze. I think we ought to 
document the bad locationmap error message in the upgrading doc, along 
with a hint on debugging via the locationmap logs (set the log level to 
debug in logkit.xml), and a pointer to the mail lists for assistance. 
(for now I've put a one line warning, better than nothing but needs 
improving since reading the locationmap logs is not easy - I'm going to 
bed now, sorry)

If someone manages to change the way the locationmap works and get it to 
throw a proper error before the code freeze then we can reinstate the 
error trapping. Otherwise we'll do it in the next release (which will 
not be so long in the making - we can discuss that after the release).

Ross

Re: [headsup] recent change in broken link handling

Posted by David Crossley <cr...@apache.org>.
Ross Gardler wrote:
> David Crossley wrote:
> >The handling of broken site: links has recently changed.
> >
> >It used to report the link as BROKEN in the site build
> >and write its details to the broken-links.xml file.
> >
> >Now it generates a file called error_site_$linkname
> >with the "No pipeline matched request:" info inside it.
> >
> >I wonder if it is related to the recent change in
> >error handling for the locationmap.
> 
> I think it probably is a side effect of the error handling. And not (in 
> its current form) a good one.

Note for the archives: See FOR-701 and r521063.

> I'm guessing it happens because now, when there is an error, Cocoon 
> traps it and creates a file describing the error. The crawler must be 
> treating this as a proper page.
> 
> So, before the code freeze we need to decide what to do:
> 
> 1) revert the error handling which will give us meaningless locationmap 
> errors, but will give a meaningful broken-links.xml file
> 
> 2) Improve the generated error page to be more user friendly and show 
> the error in a graceful way, this will make the user experience better 
> if a broken site is deployed, but will make the admin's job harder
> 
> 3) A mix of 1) and 2) in which we generate a nice user error page and we 
> use the SourceWritingTransformer to provide meaningful output for the admin.
> 
> 4) ?
> 
> Thoughts?

Dunno.

Better locationmap error messages are more important,
because that can be complex to debug.

Perhaps a way out for now is to describe the issue
in Jira, add a note to the upgrading_08.html doc, and
add to the end of the "site" ant target to scan the
build directory to list any error_site_* files.

That is not as good as the previous broken link handling
because it doesn't list the referring pages.

Another big issue is that we now do not get a "Build failed"
message when there is a linking error.
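
A minimal sketch of the scan suggested above, written as a standalone
Python script rather than as an Ant task. The function names
find_error_site_entries and report are hypothetical, and the build
directory is passed in as an argument because its location is project
specific. The non-zero return value is one possible way to get a failure
signal back; like the Ant idea, it cannot list the referring pages.

```python
from pathlib import Path


def find_error_site_entries(build_dir):
    """Return sorted relative paths of entries named error_site_* under build_dir.

    Matches both files and directories, since the generated output may be
    either.
    """
    root = Path(build_dir)
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("error_site_*")
    )


def report(build_dir):
    """Print any error_site_* entries and return 1 if some exist, else 0."""
    entries = find_error_site_entries(build_dir)
    for entry in entries:
        print("BROKEN (error page generated): " + entry)
    return 1 if entries else 0
```

A wrapper could call report() on the site output directory after the build
and pass the result to sys.exit() so that the overall run fails.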

-David

Re: [headsup] recent change in broken link handling

Posted by Ross Gardler <rg...@apache.org>.
David Crossley wrote:
> The handling of broken site: links has recently changed.
> 
> It used to report the link as BROKEN in the site build
> and write its details to the broken-links.xml file.
> 
> Now it generates a file called error_site_$linkname
> with the "No pipeline matched request:" info inside it.
> 
> I wonder if it is related to the recent change in
> error handling for the locationmap.

I think it probably is a side effect of the error handling. And not (in 
its current form) a good one.

I'm guessing it happens because now, when there is an error, Cocoon 
traps it and creates a file describing the error. The crawler must be 
treating this as a proper page.

So, before the code freeze we need to decide what to do:

1) revert the error handling which will give us meaningless locationmap 
errors, but will give a meaningful broken-links.xml file

2) Improve the generated error page to be more user friendly and show 
the error in a graceful way, this will make the user experience better 
if a broken site is deployed, but will make the admin's job harder

3) A mix of 1) and 2) in which we generate a nice user error page and we 
use the SourceWritingTransformer to provide meaningful output for the admin.

4) ?

Thoughts?

Ross