Posted to user@nutch.apache.org by Bahadir Cambel <ba...@elasticb.com> on 2010/09/21 16:34:57 UTC

Relative URLs are not crawled?

Hey guys,

Our website is built using relative URLs; for example, the menu links are
"/Products/default.html" and "/Brands/default.html".

When Nutch crawls the website, I cannot see that these links are fetched,
even though I set the depth to 2. The resulting index contains only 1 document.
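
To be clear about what I expect: the menu links should be resolved against
the URL of the page they appear on and then fetched. A minimal Python sketch
of that resolution follows; the host www.example.com is a placeholder (not
our real site), and resolve_links is a helper name I made up for this
illustration, not anything from Nutch.

```python
from urllib.parse import urljoin


def resolve_links(page_url, hrefs):
    """Resolve each href against the URL of the page it was found on."""
    return [urljoin(page_url, href) for href in hrefs]


# Placeholder host (assumption); the two hrefs are the menu links from our site.
page_url = "http://www.example.com/index.html"
menu_links = ["/Products/default.html", "/Brands/default.html"]

for url in resolve_links(page_url, menu_links):
    print(url)
```

These resolved URLs are the ones I would expect to see fetched at depth 2.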

If I run it against e.g. http://androidyou.blogspot.com, I can see that the
other URLs are fetched as well, and the links on that site are full
(absolute) URLs.

Is there a configuration setting for this?

I hope I was able to describe the issue clearly.

Kind regards,
Bahadir Cambel