Posted to user@nutch.apache.org by Mike Smith <mi...@gmail.com> on 2006/09/26 20:16:31 UTC

How to crawl (store) only English pages?

Hi,

Is there any way to store only English pages at the crawling stage, rather
than just adding the lang:en metadata to the index using the language
identifier plugin?

Thanks, Mike