Posted to solr-user@lucene.apache.org by da...@ontrenet.com on 2011/10/19 16:04:33 UTC

Merging Remote Solr Indexes?

Hi,
  I thought of a useful capability if it doesn't already exist.

Is it possible to do an index merge between two remote Solr instances?

To handle massive index-time scalability, wouldn't it be useful
to have distributed indexes accepting local input, then merge
them into one central index afterward?

Darren

Re: Merging Remote Solr Indexes?

Posted by Lance Norskog <go...@gmail.com>.
Merging indexes is not really useful: it won't make distributed search
any faster. There are features that don't work with distributed
search. Really, you are better off having shards with enough documents
so that relevance scoring is balanced.
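
For readers unfamiliar with the alternative mentioned here, this is a minimal
sketch of querying shards in place rather than merging them. Solr's distributed
search takes a "shards" request parameter listing the shard locations. The host
names, port, and query are hypothetical placeholders, and the helper name
distributed_query_url is made up for illustration.

```python
from urllib.parse import urlencode

# Hypothetical shard locations, written as host:port/path without a scheme.
SHARDS = ["solr1.example.com:8983/solr", "solr2.example.com:8983/solr"]


def distributed_query_url(coordinator, query, shards):
    """Build a /select URL that asks the coordinator to query every shard."""
    params = {"q": query, "shards": ",".join(shards)}
    return f"{coordinator}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Any shard can act as the coordinator; the first one is used here.
    print(distributed_query_url("http://solr1.example.com:8983/solr",
                                "title:lucene", SHARDS))
```

The sketch only builds the URL; whether a given feature works in distributed
mode still depends on the Solr version in use.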

On Thu, May 31, 2012 at 11:04 AM, sudarshan
<ch...@gmail.com> wrote:
> [...]



-- 
Lance Norskog
goksron@gmail.com

Re: Merging Remote Solr Indexes?

Posted by sudarshan <ch...@gmail.com>.
Hi All,
       I'm new to Solr. I saw this post about merging indexes, and I
have a similar question. From the post, I understand that merging indexes
across different cores is possible only if the cores exist on a single
machine. I want to merge indexes from different machines. Can you please
explain the different ways of doing this?

Say I have N+1 Solr instances, of which N are distinct masters and the
remaining one is meant for merging all N indexes together. This is how I
plan to merge the N indexes into one:

1. Dynamically edit the solrconfig.xml file of the (N+1)th system so that it
acts as a slave to a different master each time. A total of N rounds would
be needed to cover all N masters.
2. In each round, replicate the master's index and store it in a separate
folder: index1 from master1, index2 from master2, ..., indexN from masterN.
3. After all indexes have been replicated and moved/renamed into local
directories, merge them all.


What problems will I run into implementing this? How efficient would it be?
I believe all the index folders have to be available locally to perform the
merge. If not, please tell me a better way to merge remote indexes.

Another question I have is about mergeFactor. If I set mergeFactor to 5,
will Solr automatically take care of merging the segments into one once the
number of segments reaches 5? How can this be exploited?

Your assistance is sincerely appreciated.

Regards,
Sudarshan

 

--
View this message in context: http://lucene.472066.n3.nabble.com/Merging-Remote-Solr-Indexes-tp3434412p3987090.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Merging Remote Solr Indexes?

Posted by Darren Govoni <da...@ontrenet.com>.
Interesting, Yury. Thanks.

On 10/20/2011 11:00 AM, Yury Kats wrote:
> [...]


Re: Merging Remote Solr Indexes?

Posted by Yury Kats <yu...@yahoo.com>.
On 10/19/2011 5:15 PM, Darren Govoni wrote:
> Hi Otis,
>     Yeah, I saw page, but it says for merging cores, which I presume 
> must reside locally to the solr instance doing the merging?
> What I'm interested in doing is merging across solr instances running on 
> different machines into a single solr running on
> another machine (programmatically). Is it still possible or did I 
> misread the wiki?

Possible, but in a few steps.
1. Create new cores on the "another" machine.
2. Replicate into them from the "different" machines.
3. Merge on the "another" machine.

All 3 steps can be done programmatically.
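
As a rough illustration of the three steps above, here is a minimal sketch
that builds the HTTP requests involved: a CoreAdmin CREATE per staging core,
a replication handler "fetchindex" with a "masterUrl" override, and a CoreAdmin
"mergeindexes" with "srcCore" parameters. It only builds URLs and sends
nothing. The host names, core names, and the helper name merge_plan are
hypothetical, and it assumes each instanceDir already holds a usable Solr
configuration with the replication handler enabled.

```python
from urllib.parse import urlencode


def merge_plan(target, masters, merged_core="merged"):
    """Return, in order, the GET URLs for the create / replicate / merge steps.

    target  -- base URL of the Solr instance doing the merge (hypothetical)
    masters -- dict mapping a local staging core name to the URL of the
               remote replication handler it should pull from
    """
    admin = f"{target}/admin/cores"
    urls = []

    # Step 1: create one staging core per remote index, plus the destination core.
    for name in list(masters) + [merged_core]:
        query = urlencode({"action": "CREATE", "name": name, "instanceDir": name})
        urls.append(f"{admin}?{query}")

    # Step 2: ask each staging core to fetch the index of its remote master.
    for name, master_url in masters.items():
        query = urlencode({"command": "fetchindex", "masterUrl": master_url})
        urls.append(f"{target}/{name}/replication?{query}")

    # Step 3: merge every staging core into the destination core.
    params = [("action", "mergeindexes"), ("core", merged_core)]
    params += [("srcCore", name) for name in masters]
    urls.append(f"{admin}?{urlencode(params)}")
    return urls


if __name__ == "__main__":
    plan = merge_plan(
        "http://central.example.com:8983/solr",
        {
            "stage1": "http://master1.example.com:8983/solr/replication",
            "stage2": "http://master2.example.com:8983/solr/replication",
        },
    )
    for url in plan:
        print(url)
```

In practice the fetch in step 2 runs in the background, so a real script would
likely need to wait for each replication to finish before issuing the merge,
and then commit on the destination core so the merged documents become visible.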

Re: Merging Remote Solr Indexes?

Posted by Darren Govoni <da...@ontrenet.com>.
Actually, yeah. If you think about it, a remote merge is like the inverse
of replication.
Where replication is one-to-many, fanning out from an index, the inverse would
be merging the many back into the one.
Sorta like a recall.

I think it would be a great analog to replication.

On 10/19/2011 06:18 PM, Otis Gospodnetic wrote:
> [...]


Re: Merging Remote Solr Indexes?

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Darren,

No, that is not possible without first copying each index/shard to a single machine, on which you would then merge the indices as described on the Wiki.

Hmmmm, wouldn't it be nice to make use of existing replication code to make it possible to move shards around the cluster?

Otis
----

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



Re: Merging Remote Solr Indexes?

Posted by Darren Govoni <da...@ontrenet.com>.
Hi Otis,
    Yeah, I saw the page, but it describes merging cores, which I presume
must reside locally to the Solr instance doing the merging?
What I'm interested in doing is merging across Solr instances running on
different machines into a single Solr instance running on
another machine (programmatically). Is that still possible, or did I
misread the wiki?

Thanks!
Darren

On 10/19/2011 11:57 AM, Otis Gospodnetic wrote:
> [...]


Re: Merging Remote Solr Indexes?

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Hi Darren,

http://search-lucene.com/?q=solr+merge&fc_project=Solr


Check hit #1

Otis

