Posted to dev@nutch.apache.org by Rod Taylor <rb...@sitesell.com> on 2005/11/16 21:51:10 UTC
Log Newly Found Urls - Patch
Being able to see which new URLs are being added to the database is
more important than seeing which ones are being fetched.
When changing the regex-urlfilter or regex-normalize files, this gives
instantaneous feedback, whereas the URLs being fetched may still be
old junk for some time after the edits took place.
--
Rod Taylor <rb...@sitesell.com>