Posted to solr-user@lucene.apache.org by Neeb <mu...@hotmail.com> on 2010/06/22 17:40:03 UTC

Re: solr with hadoop

Hi,

We currently have a master-slave setup for Solr with two slave servers. We
are using SolrJ (stream-update-solr-server) to index on the master, which
takes 6 hours for around 15 million documents.

I would like to explore Hadoop, in particular for running the indexing job
with a MapReduce approach.

- I have read some comments on the JIRA tickets, but it is still unclear to
me how this setup would work.
- I am not sure which tasks are done in the map phase and which in the
reduce phase.
- Would it merge the multiple indices into one during the reduce phase, or
is that a separate task outside of MapReduce?

Any direction or guidance on this setup would be highly appreciated.

Thanks in advance,
-Ali
-- 
View this message in context: http://lucene.472066.n3.nabble.com/solr-with-hadoop-tp482688p914483.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: solr with hadoop

Posted by Marc Sturlese <ma...@gmail.com>.

I think a good solution would be to use Hadoop with SOLR-1301 to build the
Solr shards and then run Solr distributed search against those shards (you
will have to copy them from HDFS to local disk before you can search them).
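
To make the search side of this a bit more concrete, here is a minimal
sketch of how a distributed query could be put together once the shards
have been copied out of HDFS and are each served by a Solr instance. It
only builds the request URL using Solr's "shards" request parameter. The
host names, the class name and the helper method are made up for
illustration, and the /select handler path is assumed to be the default
one.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Arrays;
import java.util.List;

public class ShardQuery {

    // URL-encode a single parameter value as UTF-8.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a JVM.
            throw new IllegalStateException(e);
        }
    }

    // Build a distributed search URL. The "shards" parameter is a
    // comma-separated list of host:port/path entries without a scheme.
    // The coordinator is whichever Solr instance receives the request
    // and merges the per-shard results.
    static String distributedQueryUrl(String coordinator,
                                      List<String> shards,
                                      String query) {
        StringBuilder shardList = new StringBuilder();
        for (String shard : shards) {
            if (shardList.length() > 0) {
                shardList.append(',');
            }
            shardList.append(shard);
        }
        return "http://" + coordinator + "/select"
                + "?q=" + encode(query)
                + "&shards=" + encode(shardList.toString());
    }

    public static void main(String[] args) {
        // Hypothetical hosts, each serving one shard built by the Hadoop job.
        List<String> shards = Arrays.asList(
                "shard1.example.com:8983/solr",
                "shard2.example.com:8983/solr");
        System.out.println(
                distributedQueryUrl("localhost:8983/solr", shards, "title:hadoop"));
    }
}
```

Any of the shard hosts could act as the coordinator as well; a separate
instance is used here only to keep the roles apart.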
-- 
View this message in context: http://lucene.472066.n3.nabble.com/solr-with-hadoop-tp482688p914576.html