Posted to user@nutch.apache.org by Justin Hartman <jj...@gmail.com> on 2006/12/28 11:08:30 UTC
DmozParser Question
Hi All
I'm new to Nutch and have a few questions. For now I'll limit myself
to one, because I'd like to try to resolve the other issues myself
first. The question is about the DmozParser.
Does anyone know whether it is possible to filter the Dmoz file so
that the dmoz/url file only includes URLs from certain TLDs, such as
.co.uk?
I noticed that DmozParser supports both boolean and pattern, but I'm
not really sure how to use them for this.
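To make the question concrete, here is a rough sketch of the kind of
filtering I mean. It is standalone Java and does not use DmozParser at
all; the class name TldFilter, the method filterByHostSuffix and the
sample URLs are all made up for illustration, and I am assuming the
URLs are available one per string.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Nutch: keeps only URLs whose host
// ends with a given suffix such as ".co.uk".
class TldFilter {

    static List<String> filterByHostSuffix(List<String> urls, String suffix) {
        List<String> kept = new ArrayList<String>();
        for (String url : urls) {
            try {
                // getHost() returns null for URIs without a host part.
                String host = URI.create(url).getHost();
                if (host != null && host.toLowerCase().endsWith(suffix)) {
                    kept.add(url);
                }
            } catch (IllegalArgumentException e) {
                // Skip entries that are not syntactically valid URIs.
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up sample input standing in for the extracted URL list.
        List<String> urls = Arrays.asList(
                "http://www.example.co.uk/",
                "http://www.example.com/",
                "http://news.example.co.uk/world");
        for (String url : filterByHostSuffix(urls, ".co.uk")) {
            System.out.println(url);
        }
    }
}
```

What I would like to know is whether DmozParser can do something
equivalent itself, so that a separate filtering pass like this is not
needed.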
Any help appreciated.
--
Regards
Justin Hartman
PGP Key ID: 102CC123