Posted to user@nutch.apache.org by zhang gaozhi <ga...@teltel.com> on 2007/12/06 15:51:40 UTC
Question about nutch and solr
Dear all,
I am currently looking into Nutch, Solr, Lucene and Hadoop.
Because Nutch can work with Hadoop and our application is
special, we want to use Nutch to index XML files
and store the index files in Hadoop. Then we would use Solr to
search the index files stored in Hadoop.
My question is whether Nutch can work well for that or not. If
not, what do we need to do?
Thanks for your reply.
gaozhi
Re: Question about nutch and solr
Posted by Enis Soztutar <en...@gmail.com>.
Hi,
To clarify things a bit, let me explain Lucene and her children.
Lucene : an inverted indexing library.
Solr : a kind of index server application that wraps and extends
the capabilities of Lucene.
Hadoop : an implementation of MapReduce and a distributed file system (DFS).
Nutch : a search engine built on top of Hadoop, Lucene, and (very
soon) Solr.
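To make "inverted index" concrete, here is a toy sketch in Python. It is not how Lucene stores or scores anything internally; the function names and the sample documents are made up for illustration. It only shows the basic idea of mapping each term to the documents that contain it.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each lowercase term to the sorted ids of documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}

def search(index, query):
    """Return ids of documents that contain every query term (AND semantics)."""
    terms = query.lower().split()
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)

# Hypothetical sample documents, keyed by document id.
docs = {
    1: "nutch builds the index",
    2: "solr serves the index",
    3: "hadoop stores the data",
}
index = build_inverted_index(docs)
print(search(index, "the index"))
```

A real library adds analysis (tokenizing, stemming), scoring, and an on-disk format on top of this lookup structure.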
Your architecture will very much depend on the schema of your data and
the way you will use it. You can store your data in DFS (Hadoop) and use
Nutch to build the index (with either Lucene or Solr). Then serve the index
using either Solr or Nutch. But you have to copy the indexes to
local disk first (serving indexes from DFS is very slow). I think it would
be better if you gave some more insight into your problem and your data.
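As a rough sketch of the "copy the indexes to local" step, the Python snippet below wraps the Hadoop shell's `-copyToLocal` command. The helper names and both paths are hypothetical, and it assumes a `hadoop` executable is on the PATH; adjust it to wherever your index actually lives in DFS.

```python
import subprocess

def build_copy_command(dfs_path, local_path):
    """Build the Hadoop shell command that copies a DFS path to local disk."""
    return ["hadoop", "fs", "-copyToLocal", dfs_path, local_path]

def copy_index_to_local(dfs_path, local_path):
    """Run the copy; raises CalledProcessError if the hadoop command fails."""
    subprocess.run(build_copy_command(dfs_path, local_path), check=True)

# Hypothetical paths: an index directory in DFS and a local target directory.
command = build_copy_command("crawl/index", "/data/search/index")
print(" ".join(command))
```

After the copy finishes, the search server (Solr or Nutch) would be pointed at the local directory rather than at DFS.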