Posted to user@nutch.apache.org by Mr Hadoop <mr...@gmail.com> on 2009/12/04 20:51:47 UTC

What is the best choice: nutch/lucene or nutch/solr?

I have gone through the mailing list and still haven't found an answer.

For a project, I need to crawl the web, index it, and merge that content with
another site's content, which is stored in a key-value storage system.

What is the best approach: merge these two sets of content into a Lucene index,
merge them into a Solr index, or keep the indexes separate and merge the
results at query time?

Re: What is the best choice: nutch/lucene or nutch/solr?

Posted by Otis Gospodnetic <og...@yahoo.com>.
It sounds like a job for Nutch to crawl and gather the data, plus custom tools that read the gathered data, call the KV store, construct SolrInputDocuments, and index those to Solr. That is, if you want Solr and not Lucene, which is a bigger question that I can't answer without knowing the details.

 Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
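
The merge step described in the reply above (read the crawled data, look up the KV store, build one document per page) might look roughly like the sketch below. It is only an illustration under stated assumptions: the page URL is assumed to be the KV-store key, the KV store is stood in for by a plain dict, and the field names (id, title, content, kv_*) as well as build_solr_docs are hypothetical. Reading Nutch segments and actually sending the documents to Solr are left out.

```python
import json


def build_solr_docs(crawled_pages, kv_store):
    """Merge crawled pages with KV-store records into flat documents.

    crawled_pages: iterable of dicts with at least a "url" key (assumed shape).
    kv_store: any mapping with .get(); a dict stands in for the real store.
    """
    docs = []
    for page in crawled_pages:
        # Assumption: the page URL is also the key used in the KV store.
        extra = kv_store.get(page["url"], {})
        doc = {
            "id": page["url"],
            "title": page.get("title", ""),
            "content": page.get("content", ""),
        }
        # Prefix KV fields so they cannot overwrite the crawl fields.
        for key, value in extra.items():
            doc["kv_" + key] = value
        docs.append(doc)
    return docs


# Made-up sample data, for illustration only.
crawled_pages = [
    {"url": "http://example.com/a", "title": "Page A", "content": "text of A"},
    {"url": "http://example.com/b", "title": "Page B", "content": "text of B"},
]
kv_store = {"http://example.com/a": {"category": "news"}}

docs = build_solr_docs(crawled_pages, kv_store)
# Serialized form, e.g. for handing to whatever indexing client is used.
payload = json.dumps(docs)
```

Merging at index time like this gives a single index to query. Keeping two separate indexes and merging at query time would instead require combining and re-ranking two result lists, which this sketch does not cover.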


