Posted to user@nutch.apache.org by Emmanuel <jo...@gmail.com> on 2007/08/02 18:02:01 UTC

Dedup

The dedup process is quite useful. Unfortunately, the URLs of the deleted
content are not removed from the CrawlDb.

Don't you think we could either remove such URLs from the CrawlDb, or change
their status and fetch interval to avoid fetching them again so quickly?