Posted to user@nutch.apache.org by sojianzhi master <so...@gmail.com> on 2009/08/19 07:39:15 UTC

Question about crawling internal relative web links

Hello everyone,

I have a question: when I crawl a web page, the relative links on it are not
crawled and are not added to the fetch list.
For example, the HTML of the page being crawled contains:
......
<a class="rootclass" href="/ProductShop/ShowClass.asp?ClassId=103">Internal
Link ...</a>
.....

But the link "/ProductShop/ShowClass.asp?ClassId=103" is not added.
How do I configure Nutch to fetch these links from the page?
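
To make the expected behaviour concrete, here is a minimal Python sketch (not
Nutch code) of how I would expect the relative href above to resolve to an
absolute URL. The base URL http://www.example.com/shop/index.asp is a
placeholder made up for illustration; it is not the real site.

```python
from urllib.parse import urljoin

# Hypothetical page URL, used only for illustration (not the real site).
base_url = "http://www.example.com/shop/index.asp"

# The relative href taken from the HTML snippet above.
href = "/ProductShop/ShowClass.asp?ClassId=103"

# The leading "/" makes the href root-relative, so it replaces the
# whole path of the base URL while keeping the scheme and host.
absolute_url = urljoin(base_url, href)
print(absolute_url)
```

The resolved URL, with the query string "?ClassId=103" intact, is what I would
expect to see in the fetch list.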

Any help will be much appreciated.