Posted to user@nutch.apache.org by zhang gaozhi <ga...@teltel.com> on 2007/12/06 15:51:40 UTC

Question about nutch and solr

 Dear,

       I am currently looking into Nutch, Solr, Lucene and Hadoop.
       Because Nutch can work with Hadoop and our application is
special, we want to use Nutch to index XML files and store the
index files in Hadoop. Then we would use Solr to search those
index files in Hadoop.
      My question is whether Nutch can work well for that or not. If
not, what do we need to do?
      Thanks for your reply.

 thanks

 gaozhi

Re: Question about nutch and solr

Posted by Enis Soztutar <en...@gmail.com>.
Hi,

To clarify things a bit, let me explain Lucene and her children.

Lucene : an inverted-index library.
Solr   : a kind of index server application that wraps and extends
the capabilities of Lucene.
Hadoop : an implementation of MapReduce and a distributed file system (DFS).
Nutch  : a search engine built on top of Hadoop and Lucene, and (very
soon) Solr.
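
As a rough illustration of what "inverted index" means in the list above, here is a minimal Python sketch. It is not Lucene's implementation or API; the function names and sample documents are made up for the example. It only shows the core idea of mapping each term to the documents that contain it.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to the sorted list of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def search(index, query):
    """Return ids of documents that contain every term in the query (AND semantics)."""
    terms = query.lower().split()
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)


# Hypothetical sample documents, for illustration only.
docs = {
    1: "nutch crawls and indexes web pages",
    2: "solr serves a lucene index over http",
    3: "hadoop stores files in a distributed file system",
}
index = build_inverted_index(docs)
```

A lookup such as search(index, "lucene index") then only touches the entries for those two terms instead of scanning every document, which is the property a real indexing library builds on.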

Your architecture will depend heavily on the schema of your data and
on how you will use it. You can store your data in DFS (Hadoop) and use
Nutch to build the index (with either Lucene or Solr), then serve the
index using either Solr or Nutch. But you have to copy the indexes to
local disk first, because serving indexes from DFS is very slow. I think
it would be better if you gave some more detail about your problem and
your data.
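
The "copy the indexes to local" step could be scripted roughly as follows. This is only a sketch: it assumes the hadoop command-line tool is on the PATH and uses its "fs -copyToLocal" shell command. The helper names and the paths in the usage comment are hypothetical and should be replaced with your own.

```python
import subprocess


def copy_index_command(dfs_index_dir, local_dir, hadoop_bin="hadoop"):
    """Build the command line that copies an index directory from DFS to local disk."""
    return [hadoop_bin, "fs", "-copyToLocal", dfs_index_dir, local_dir]


def copy_index_to_local(dfs_index_dir, local_dir, hadoop_bin="hadoop"):
    """Run the copy; raises CalledProcessError if the hadoop command fails."""
    subprocess.run(
        copy_index_command(dfs_index_dir, local_dir, hadoop_bin),
        check=True,
    )


# Usage (hypothetical paths; needs a working Hadoop installation):
#   copy_index_to_local("crawl/indexes", "/data/search/indexes")
```

Once the copy finishes, the searcher (Solr or Nutch) would be pointed at the local directory rather than at DFS.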

