Posted to dev@forrest.apache.org by Thorsten Scherler <th...@juntadeandalucia.es> on 2007/05/10 10:19:19 UTC

web.archive.org stopped indexing our site

Hi all,

when I was trying to find the origin of the dead link reported by Bart, I
found out that the Wayback Machine has stopped indexing our web site.
http://web.archive.org/web/*/http://forrest.apache.org

The last entry is Apr 24, 2006, which makes me wonder whether we changed
the .htaccess or robots.txt to exclude crawlers. Is this on purpose?

salu2
-- 
Thorsten Scherler                                 thorsten.at.apache.org
Open Source Java                      consulting, training and solutions


Re: web.archive.org stopped indexing our site

Posted by Thorsten Scherler <th...@juntadeandalucia.es>.
On Thu, 2007-05-10 at 19:33 +1000, David Crossley wrote:
> Thorsten Scherler wrote:
> > Hi all,
> > 
> > when I was trying to find the origin of the dead link reported by Bart, I
> > found out that the Wayback Machine has stopped indexing our web site.
> > http://web.archive.org/web/*/http://forrest.apache.org
> > 
> > The last entry is Apr 24, 2006, which makes me wonder whether we changed
> > the .htaccess or robots.txt to exclude crawlers. Is this on purpose?
> 
> I am not aware that anything changed at our end.
> 
> Try our dev archives; Ferdinand and I talked
> about this before.
> 
> You get a similar result for www.apache.org
> 
> I wonder if the results are deliberately old,
> i.e. they don't show the last six months of results
> so that people don't use it as a daily tool.

Ah, OK, that makes sense and explains the gap. Maybe they have just extended
the period to a full year.

salu2
-- 
Thorsten Scherler                                 thorsten.at.apache.org
Open Source Java                      consulting, training and solutions


Re: web.archive.org stopped indexing our site

Posted by David Crossley <cr...@apache.org>.
Thorsten Scherler wrote:
> Hi all,
> 
> when I was trying to find the origin of the dead link reported by Bart, I
> found out that the Wayback Machine has stopped indexing our web site.
> http://web.archive.org/web/*/http://forrest.apache.org
> 
> The last entry is Apr 24, 2006, which makes me wonder whether we changed
> the .htaccess or robots.txt to exclude crawlers. Is this on purpose?

I am not aware that anything changed at our end.

Try our dev archives; Ferdinand and I talked
about this before.

You get a similar result for www.apache.org

I wonder if the results are deliberately old,
i.e. they don't show the last six months of results
so that people don't use it as a daily tool.

-David