Posted to user@nutch.apache.org by xiao yang <ya...@gmail.com> on 2009/07/03 21:06:50 UTC

what's the relationship between nutch, solr, lucene, and hadoop

Hi, guys,

I'm quite confused about them.
It seems that Nutch contains Solr, Lucene, and Hadoop, but I'm not quite sure.
What roles do they play in a search engine?
I know Lucene is for indexing and Hadoop is for storage. What about
Nutch and Solr?

Thanks!
Xiao

Re: what's the relationship between nutch, solr, lucene, and hadoop

Posted by jo...@findwise.se.
Lucene is the indexing and search library.
Solr is a search server built on Lucene; it is the interface for using the Lucene index.
Hadoop is used to run tasks in a distributed fashion.
Nutch fetches e.g. web pages (distributing the work with Hadoop) and may insert
them into the Lucene index through Solr.
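
To make the division of roles concrete, here is a toy sketch in Python. It does not use any real Nutch, Solr, Lucene, or Hadoop API; every class and function name below (ToyIndex, ToySearchServer, run_distributed, crawl) is made up for illustration, and the "distributed" runner is just a sequential loop.

```python
from collections import defaultdict

# Toy stand-ins only. None of these names are real Nutch/Solr/Lucene/Hadoop APIs.


class ToyIndex:
    """Plays Lucene's role: an inverted index from term to document ids."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        # Split on whitespace and record which document each term occurs in.
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def search(self, term):
        # .get() avoids creating an empty entry for unknown terms.
        return sorted(self.postings.get(term.lower(), set()))


class ToySearchServer:
    """Plays Solr's role: the interface through which clients use the index."""

    def __init__(self):
        self._index = ToyIndex()

    def update(self, doc_id, text):
        self._index.add(doc_id, text)

    def select(self, query):
        return self._index.search(query)


def run_distributed(task, inputs):
    """Plays Hadoop's role: apply a task to many inputs.

    A real cluster would spread this over machines; here it runs sequentially.
    """
    return [task(item) for item in inputs]


def crawl(pages, server):
    """Plays Nutch's role: fetch pages and push them to the search server."""

    def fetch(url):
        # Pretend fetch: look the page up in a dict instead of using the network.
        return url, pages[url]

    for url, text in run_distributed(fetch, sorted(pages)):
        server.update(url, text)
```

With two pretend pages, say {"http://a.example/": "Apache Nutch crawler", "http://b.example/": "Apache Lucene index"}, calling crawl(pages, server) and then server.select("nutch") returns only the first URL. The point is the layering: the crawler talks to the server, and the server talks to the index.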

I hope this was clear/correct enough

Regards Johan S.

