Posted to dev@nutch.apache.org by sumittyagi <pi...@gmail.com> on 2009/11/17 19:48:48 UTC

Filtering Pages while crawling

How can I exclude certain pages, such as privacy policies and terms and
conditions, from a crawl? All of these pages contain bogus information. I am
new to Nutch.
Please let me know how to do this.

Thanks in advance.
-- 
View this message in context: http://old.nabble.com/Filtering-Pages-while-crawling-tp26395359p26395359.html