Posted to user@nutch.apache.org by reinhard schwab <re...@aon.at> on 2009/08/22 20:31:01 UTC

Re: crawldb not updating

Either you have no seed URLs or your filter is too restrictive.
Also note that nutch crawl uses conf/crawl-urlfilter.txt by default,
not conf/regex-urlfilter.txt!
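
One rough way to check whether a filter file would let your seeds through is to replay the rules against each seed URL offline. The sketch below is not Nutch code. It assumes the usual rule format of these filter files ('+regex' accepts, '-regex' rejects, '#' starts a comment), that the first matching rule decides, and that a URL matching no rule is dropped. Python's re module stands in for Java regex here, so unusual patterns may behave differently. The names load_rules, accepts and SAMPLE_RULES and the domain example.org are made up for illustration.

```python
import re

# Hypothetical rule set in the style of a Nutch URL filter file.
SAMPLE_RULES = r"""
# skip file:, ftp: and mailto: urls
-^(file|ftp|mailto):
# accept hosts under example.org (placeholder domain)
+^http://([a-z0-9]*\.)*example\.org/
# skip everything else
-.
"""


def load_rules(text):
    """Parse rule lines into (allow, compiled_pattern) pairs."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments carry no rule
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError("rule must start with + or -: " + line)
        rules.append((sign == "+", re.compile(pattern)))
    return rules


def accepts(rules, url):
    """Return True if the first rule that matches the URL is a '+' rule."""
    for allow, pattern in rules:
        if pattern.search(url):
            return allow
    return False  # assumption: a URL that matches no rule is dropped


if __name__ == "__main__":
    rules = load_rules(SAMPLE_RULES)
    for seed in ["http://www.example.org/index.html", "http://other.com/"]:
        print(seed, "accepted" if accepts(rules, seed) else "rejected")
```

To try it on a real setup, read the contents of conf/crawl-urlfilter.txt into load_rules and pass each line of your seed file to accepts. If every seed is rejected, the crawldb will stay empty without any error being reported.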

Aditya Sakhuja wrote:
> I am having issues getting the data injected into the crawldb. I have set
> the filter in the conf/regex-urlfilter.txt file and also set the seed
> URLs as expected.
>
> The command I am using is :
>
> nutch crawl ~/urls -dir crawl -depth 3 -topN 50
>
> I do not get any errors, however the crawldb is not updated either.
>
> Has anyone successfully got the crawldb injected?
>
> Thanks,
> Aditya
>
>