Posted to dev@nutch.apache.org by Rod Taylor <rb...@sitesell.com> on 2005/11/16 21:51:10 UTC

Log Newly Found Urls - Patch

Being able to see which new URLs are being added to the database is more
important than seeing which ones are being fetched.

When changing the regex-urlfilter or regex-normalize files, logging newly
found URLs gives immediate feedback, whereas the URLs being fetched may
still be old junk for some time after the edits took place.

-- 
Rod Taylor <rb...@sitesell.com>