Posted to user@nutch.apache.org by DS jha <ae...@gmail.com> on 2008/06/17 05:04:06 UTC

getting seed list for vertical search engine

Hello,
We are developing a vertical search engine for the medical industry,
and I need to estimate server sizing requirements to set up my
environment. How do I estimate how many documents I will be fetching
for a particular vertical? And where do I get a seed list of all the
sites? Will the dmoz health category be sufficient, or will I have to
purchase a seed list?

Thanks