Posted to solr-user@lucene.apache.org by Software Dev <st...@gmail.com> on 2014/03/25 17:42:16 UTC

Replication (Solr Cloud)

I see that, by default, my collections in SolrCloud are replicating.
Should replication be disabled, since SolrCloud already handles this
itself?

From the documentation:

"The Replication screen shows you the current replication state for
the named core you have specified. In Solr, replication is for the
index only. SolrCloud has supplanted much of this functionality, but
if you are still using index replication, you can use this screen to
see the replication state:"

Before I disable it, I just want to make sure that if we send an update
to one server, the document will be correctly replicated across all
nodes. Thanks.

Re: Replication (Solr Cloud)

Posted by Software Dev <st...@gmail.com>.
"In older versions it might have done them all at once, but I believe
that newer versions only do one core at a time."

It looks like it did them all at once, and I'm on the latest version (4.7).

Re: Replication (Solr Cloud)

Posted by Walter Underwood <wu...@wunderwood.org>.
Yes, it is generally a bad idea to optimize.

The system continually does merges as needed. You generally do not need to force a full merge.
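
To make the "merges as needed" point concrete, here is a toy model in Python. It is only a sketch: the function name, the merge factor of 10 and the docs-per-flush figure are assumptions for illustration, and it is a simplified tier scheme, not Lucene's actual merge policy.

```python
from collections import defaultdict


def index_batches(num_flushes, merge_factor=10, docs_per_flush=1000):
    """Toy model of background merging (NOT Lucene's real merge policy).

    Every flush writes one small segment. Whenever a tier collects
    `merge_factor` segments, they are merged into one segment in the
    next tier. Returns (segments_remaining, merges_performed).
    """
    tiers = defaultdict(list)  # tier number -> list of segment sizes (docs)
    merges = 0
    for _ in range(num_flushes):
        tiers[0].append(docs_per_flush)
        tier = 0
        # Cascade: a merge can fill up the next tier and trigger another merge.
        while len(tiers[tier]) >= merge_factor:
            merged_size = sum(tiers[tier])
            tiers[tier] = []
            tiers[tier + 1].append(merged_size)
            merges += 1
            tier += 1
    segments_remaining = sum(len(sizes) for sizes in tiers.values())
    return segments_remaining, merges


# 25 flushes leave 7 segments after 2 small merges, with no forced merge.
print(index_batches(25))
```

In this model the segment count stays small because the merging is spread out over time in small pieces, whereas an optimize rewrites everything in one go.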

wunder

Re: Replication (Solr Cloud)

Posted by Software Dev <st...@gmail.com>.
So it's generally a bad idea to optimize, I gather?

- In older versions it might have done them all at once, but I believe
that newer versions only do one core at a time.

Re: Replication (Solr Cloud)

Posted by Shawn Heisey <so...@elyograg.org>.
On 3/25/2014 11:59 AM, Software Dev wrote:
> Ehh.. found out the hard way. I optimized the collection on 1 machine
> and when it was completed it replicated to the others and took my
> cluster down. Shitty

It doesn't get replicated -- each core in the collection will be 
optimized.  In older versions it might have done them all at once, but I 
believe that newer versions only do one core at a time.

Doing an optimize on a Solr core results in a LOT of I/O. If your Solr
install is already having performance issues, that will push it over the
edge.  When SolrCloud ends up with a performance problem in one place,
the problems tend to multiply.  It can get bad enough that the whole
cluster goes down because it's trying to do a recovery on every node.
For that reason, it's extremely important that you have enough system
resources available across your cloud (RAM in particular) to avoid
performance issues.

Thanks,
Shawn


Re: Replication (Solr Cloud)

Posted by Software Dev <st...@gmail.com>.
Ehh.. found out the hard way. I optimized the collection on 1 machine
and when it was completed it replicated to the others and took my
cluster down. Shitty

Re: Replication (Solr Cloud)

Posted by Software Dev <st...@gmail.com>.
One other question. If I optimize a collection on one node, does this
get replicated to all others when finished?

Re: Replication (Solr Cloud)

Posted by Software Dev <st...@gmail.com>.
Thanks for the reply. I'll make sure NOT to disable it.

Re: Replication (Solr Cloud)

Posted by Michael Della Bitta <mi...@appinions.com>.
No, don't disable replication!

Replicas ordinarily keep up with updates because every document is sent
to each member of the shard. However, if a replica goes offline for a
period of time and comes back, replication is used to "catch up" that
replica. So you really need it on.
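
As a rough illustration of those two paths, here is a toy Python model. The class and function names are made up for this sketch and it is in no way Solr's implementation. It only shows the idea that live members receive each document directly, while a member that was offline copies the index to catch up.

```python
class Replica:
    """Hypothetical stand-in for one member of a shard."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self.index = {}  # doc id -> document


def send_update(members, doc_id, doc):
    """Normal operation: the document itself goes to every live member."""
    for member in members:
        if member.online:
            member.index[doc_id] = doc


def recover(member, leader):
    """Catch-up: copy the leader's index wholesale, as replication would."""
    member.index = dict(leader.index)
    member.online = True


leader, follower = Replica("leader"), Replica("follower")
shard = [leader, follower]

send_update(shard, 1, "first doc")   # both members get it
follower.online = False
send_update(shard, 2, "second doc")  # follower misses this one

missed = sorted(set(leader.index) - set(follower.index))
recover(follower, leader)            # replication closes the gap
print(missed, follower.index == leader.index)
```

If replication were disabled, the recover step would have nothing to work with and the returning member would stay out of date.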

If you created your collection with the collections API and the required
bits are in schema.xml and solrconfig.xml, you should be good to go. See
https://wiki.apache.org/solr/SolrCloud#Required_Config

Michael Della Bitta


Re: Replication (Solr Cloud)

Posted by Shawn Heisey <so...@elyograg.org>.
The replication handler must be configured for SolrCloud to operate 
properly ... but not in the way that you might think. This is a source 
of major confusion for those who are new to SolrCloud, especially if 
they already understand master/slave replication.

During normal operation, SolrCloud does NOT use replication.  
Replication is ONLY used to recover indexes.  When everything is working 
well, recovery only happens when a Solr instance starts up.

Every Solr instance will be a master.  If that Solr instance has *EVER*
(since the last instance start) replicated its index from a shard
leader, it will *also* say that it is a slave. These are NOT indications
that a replication is occurring; they are just the current configuration
state of the replication handler.

You can ignore everything you see on the replication tab if you are 
running SolrCloud.  It only has meaning at the moment a replication is 
happening, and that is completely automated by SolrCloud.
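
If you want to look at those flags outside the admin UI, the sketch below may help. It assumes the replication handler answers command=details and that the JSON response carries "isMaster" and "isSlave" as strings under "details". Treat those field names, the helper names and the example core name as assumptions to check against your own Solr version. The code makes no network call; it only builds the URL and interprets an already-parsed response.

```python
from urllib.parse import urlencode


def details_url(base_url, core):
    """Build the URL for the replication handler's details command."""
    query = urlencode({"command": "details", "wt": "json"})
    return f"{base_url}/{core}/replication?{query}"


def describe_flags(response):
    """Interpret the master/slave flags as configuration state only.

    `response` is the parsed JSON body; the "isMaster"/"isSlave" keys
    are assumed field names, not verified against every Solr release.
    """
    details = response.get("details", {})
    is_master = str(details.get("isMaster", "false")).lower() == "true"
    is_slave = str(details.get("isSlave", "false")).lower() == "true"
    if is_master and is_slave:
        return "master and slave: has recovered from a leader since startup"
    if is_master:
        return "master only: no recovery from a leader since startup"
    return "not flagged as master: unexpected under SolrCloud"


# Hypothetical core name and hand-written sample response, for illustration.
print(details_url("http://localhost:8983/solr", "collection1_shard1_replica1"))
print(describe_flags({"details": {"isMaster": "true", "isSlave": "true"}}))
```

Either result describes what the handler has done since startup, not whether a replication is running right now.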

Thanks,
Shawn