Posted to common-user@hadoop.apache.org by Bing Li <lb...@gmail.com> on 2010/11/19 17:26:11 UTC

How to Transmit and Append Indexes

Hi, all,

I am working on a distributed search system. Right now I have only one server.
It has to crawl pages from the Web, generate indexes locally, and respond to
users' queries. I think that is too much work for one machine to do smoothly.

I plan to use at least two servers. One of them will crawl pages and generate
indexes. After that, the newly available indexes should be transmitted to
another server, which is responsible for responding to users' queries. From the
users' point of view, this system must be fast. However, I don't know how to
get hold of just the additional indexes so that I can transmit them. After
transmission, how do I append them to the old indexes? Does the appending
block searching?

Lucene is used to generate the indexes. However, I cannot see which parts of
the index were updated, so I cannot send them. I know Hadoop does this kind of
thing internally. How can it be combined with Lucene?

Thanks so much for your help!

Bing Li

Re: How to Transmit and Append Indexes

Posted by Bing Li <lb...@gmail.com>.
Dear Marc,

I know the methodologies of Hadoop. My goal is to use Hadoop to manage
Lucene indexes. How can I do that?

However, in Lucene I CANNOT see the index updates, so I cannot replicate the
updates to the other nodes that are responsible for responding to queries,
right?

It seems that Solr solves this problem. Does Solr have features similar to
Hadoop's?

However, to use Solr I must have Tomcat, which is a little heavy.

Thanks so much!
Bing Li

On Sat, Nov 20, 2010 at 1:49 AM, Marc Sturlese <ma...@gmail.com> wrote:

>
> You could implement some scripts that use rsync to send the index updates
> to the slave. Do something similar to what Solr does:
> http://wiki.apache.org/solr/CollectionDistribution
>
> However, if what you want is a total merge of the indexes, you can do it
> easily with Lucene:
>
> http://lucene.apache.org/java/3_0_0/api/core/org/apache/lucene/index/IndexWriter.html#addIndexesNoOptimize(org.apache.lucene.store.Directory...)

Re: How to Transmit and Append Indexes

Posted by Marc Sturlese <ma...@gmail.com>.
You could implement some scripts that use rsync to send the index updates to
the slave. Do something similar to what Solr does:
http://wiki.apache.org/solr/CollectionDistribution
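
To make the rsync idea concrete, here is a minimal sketch in plain Java (no
Lucene dependency) of the "send only what is new" step. It is not what Solr's
scripts actually do. It rests on assumptions about the on-disk layout that are
not stated in this thread: that index data files are written once and never
modified afterwards, that commit points are the files whose names start with
"segments", and that the lock file is called "write.lock". It also assumes the
indexer has committed and is not writing while the copy runs. The class and
method names (IndexSync, pushNewFiles) are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

public class IndexSync {

    /**
     * Copies to replicaDir every file of masterDir that the replica lacks.
     * Data files go first and "segments*" files last, so the replica never
     * holds a commit point that refers to data files it has not received.
     * Returns the names of the copied files, in copy order.
     */
    public static List<String> pushNewFiles(Path masterDir, Path replicaDir)
            throws IOException {
        Files.createDirectories(replicaDir);
        List<Path> dataFiles = new ArrayList<>();
        List<Path> commitFiles = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(masterDir)) {
            for (Path entry : entries) {
                if (!Files.isRegularFile(entry)) continue;
                String name = entry.getFileName().toString();
                if (name.equals("write.lock")) continue; // assumed lock file name
                if (name.startsWith("segments")) commitFiles.add(entry);
                else dataFiles.add(entry);
            }
        }
        List<String> copied = new ArrayList<>();
        for (Path src : dataFiles) {
            Path dst = replicaDir.resolve(src.getFileName().toString());
            // Assumption: data files are write-once, so an existing copy is current.
            if (Files.exists(dst)) continue;
            copyViaTemp(src, dst);
            copied.add(src.getFileName().toString());
        }
        for (Path src : commitFiles) {
            // Commit files are small and may be rewritten, so always refresh them.
            copyViaTemp(src, replicaDir.resolve(src.getFileName().toString()));
            copied.add(src.getFileName().toString());
        }
        return copied;
    }

    // Copy under a temporary name first so a half-written file never carries
    // the final name.
    private static void copyViaTemp(Path src, Path dst) throws IOException {
        Path tmp = dst.resolveSibling(dst.getFileName() + ".tmp");
        Files.copy(src, tmp, StandardCopyOption.REPLACE_EXISTING);
        Files.move(tmp, dst, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java IndexSync <masterIndexDir> <replicaIndexDir>");
            return;
        }
        System.out.println("copied: " + pushNewFiles(Paths.get(args[0]), Paths.get(args[1])));
    }
}
```

Between two machines the replica directory would have to be a mounted remote
path, or the copy would be replaced by rsync itself. The sketch never deletes
obsolete files on the replica, and the searching server would still need to
reopen its reader (IndexReader.reopen() in Lucene 3.0) before queries see the
new documents.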

However, if what you want is a total merge of the indexes, you can do it
easily with Lucene:
http://lucene.apache.org/java/3_0_0/api/core/org/apache/lucene/index/IndexWriter.html#addIndexesNoOptimize(org.apache.lucene.store.Directory...)



-- 
View this message in context: http://lucene.472066.n3.nabble.com/How-to-Transmit-and-Append-Indexes-tp1931444p1931881.html
Sent from the Hadoop lucene-users mailing list archive at Nabble.com.