Posted to solr-user@lucene.apache.org by Varun Thacker <va...@vthacker.in> on 2018/10/02 05:17:20 UTC

Re: Rule-based replication or sharding

Hi Chuck,

I was chatting with Noble offline and he suggested we could use this
starting 7.5

{replica:'#EQUAL', shard:'#EACH', sysprop.az:'#EACH'}

where "az" is a sysprop passed when starting each Solr instance (e.g. -Daz=us-east-1).

It's documented
https://lucene.apache.org/solr/guide/7_5/solrcloud-autoscaling-policy-preferences.html

Let me know if this works for you.
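A minimal sketch of how this could be wired up (the ZooKeeper address, host, and port are placeholders, and the JSON shape follows the 7.5 ref guide rather than anything tested in this thread):

  # start every Solr node with the availability-zone sysprop
  bin/solr start -cloud -z zk1:2181,zk2:2181,zk3:2181 -Daz=us-east-1

  # then set the cluster policy once through the Autoscaling API
  curl -X POST http://localhost:8983/api/cluster/autoscaling \
    -H 'Content-Type: application/json' \
    -d '{"set-cluster-policy": [
          {"replica": "#EQUAL", "shard": "#EACH", "sysprop.az": "#EACH"}
        ]}'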


On Wed, Sep 26, 2018 at 9:11 AM Chuck Reynolds <cr...@ancestry.com>
wrote:

> Noble,
>
> Are you saying in the latest version of Solr that this would work with
> three instances of Solr running on each server?
>
> If so how?
>
> Thanks again for your help.
>
> On 9/26/18, 9:11 AM, "Noble Paul" <no...@gmail.com> wrote:
>
>     I'm not sure if it is pertinent to ask you to move to the latest Solr
>     which has the policy based replica placement. Unfortunately, I don't
>     have any other solution I can think of
>
>     On Wed, Sep 26, 2018 at 11:46 PM Chuck Reynolds <
> creynolds@ancestry.com> wrote:
>     >
>     > Noble,
>     >
>     > So, other than manually moving replicas of a shard, do you have a
>     > suggestion for how one might accomplish multiple availability zones with
>     > multiple instances of Solr running on each server?
>     >
>     > Thanks
>     >
>     > On 9/26/18, 12:56 AM, "Noble Paul" <no...@gmail.com> wrote:
>     >
>     >     The rules suggested by Steve are correct. I tested them locally and
>     >     got the same errors, which probably means a bug exists.
>     >     All the new development efforts are invested in the new policy feature:
>     >     https://lucene.apache.org/solr/guide/7_4/solrcloud-autoscaling-policy-preferences.html
>     >
>     >     The old one is going to be deprecated pretty soon. So, I'm not sure
>     >     if we should be investing our resources here.
>     >     On Wed, Sep 26, 2018 at 1:23 PM Chuck Reynolds <
> creynolds@ancestry.com> wrote:
>     >     >
>     >     > Shawn,
>     >     >
>     >     > Thanks for the info. We’ve been running this way for the past
> 4 years.
>     >     >
>     >     > We were running on very large hardware, 20 physical cores with
>     >     > 256 GB of RAM and 3 billion documents, and it was the only way we
>     >     > could take advantage of the hardware.
>     >     >
>     >     > Running 1 Solr instance per server never gave us the
> throughput we needed.
>     >     >
>     >     > So I somewhat disagree with your statement because our test
> proved otherwise.
>     >     >
>     >     > Thanks for the info.
>     >     >
>     >     > Sent from my iPhone
>     >     >
>     >     > > On Sep 25, 2018, at 4:19 PM, Shawn Heisey <
> apache@elyograg.org> wrote:
>     >     > >
>     >     > >> On 9/25/2018 9:21 AM, Chuck Reynolds wrote:
>     >     > >> Each server has three instances of Solr running on it so
> every instance on the server has to be in the same replica set.
>     >     > >
>     >     > > You should be running exactly one Solr instance per server.
> When evaluating rules for replica placement, SolrCloud will treat each
> instance as completely separate from all others, including others on the
> same machine.  It will not know that those three instances are on the same
> machine.  One Solr instance can handle MANY indexes.
>     >     > >
>     >     > > There is only ONE situation where it makes sense to run
> multiple instances per machine, and in my strong opinion, even that
> situation should not be handled with multiple instances. That situation is
> this:  When running one instance would require a REALLY large heap.
> Garbage collection pauses can become extreme in that situation, so some
> people will run multiple instances that each have a smaller heap, and
> divide their indexes between them. In my opinion, when you have enough
> index data on an instance that it requires a huge heap, instead of running
> two or more instances on one server, it's time to add more servers.
>     >     > >
>     >     > > Thanks,
>     >     > > Shawn
>     >     > >
>     >
>     >
>     >
>     >     --
>     >     -----------------------------------------------------
>     >     Noble Paul
>     >
>     >
>
>
>     --
>     -----------------------------------------------------
>     Noble Paul
>
>
>

Re: Rule-based replication or sharding

Posted by Shawn Heisey <ap...@elyograg.org>.
On 10/2/2018 9:11 AM, Chuck Reynolds wrote:
> Until we move to Solr 7.5, is there a way that we can control sharding with the core.properties file?
>
> It seems to me that you used to be able to put a core.properties file in the Solr home path with something like the following.
>
> coreNodeName=bts_shard3_01
> shard=shard3
> collection=BTS
>
> Then start Solr and it would create the sharding based on the information in the core.properties file.
>
> When I try it with Solr 6.6 it seems to ignore the core.properties file.

When running SolrCloud, don't try to manually add cores, mess with the 
core.properties file, or use the CoreAdmin API unless you understand 
****EXACTLY**** how SolrCloud works internally.  And even if you do have 
that level of understanding, I strongly recommend not doing it.  It's 
easy to get wrong.  Use the Collections API to make changes to your 
indexes.  Virtually any action that people need to do to their indexes 
is supported by the Collections API, and if there's something important 
missing, then we can talk about adding it.  If the Collections API is 
bypassed, there's a good chance that something will be missing/incorrect 
in either zookeeper or core.properties, maybe both.
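For instance, a collection with explicitly named shards can be created entirely through the Collections API. A hedged sketch, reusing the BTS name from the quoted core.properties and a placeholder configset:

  curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=BTS&router.name=implicit&shards=shard1,shard2,shard3&replicationFactor=2&maxShardsPerNode=3&collection.configName=bts_conf"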

If you're trying to create a new shard on a collection with the implicit 
router, this is probably what you're looking for:

https://lucene.apache.org/solr/guide/7_5/collections-api.html#createshard
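A hedged example of that call, plus an ADDREPLICA to place the new shard's replicas explicitly (host, shard, and node names are placeholders):

  # add a named shard to an implicit-router collection
  curl "http://localhost:8983/solr/admin/collections?action=CREATESHARD&collection=BTS&shard=shard4"

  # optionally pin an additional replica of that shard to a particular node
  curl "http://localhost:8983/solr/admin/collections?action=ADDREPLICA&collection=BTS&shard=shard4&node=server1:8983_solr"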

Thanks,
Shawn


Re: Rule-based replication or sharding

Posted by Chuck Reynolds <cr...@ancestry.com>.
Thanks Varun,

Until we move to Solr 7.5, is there a way that we can control sharding with the core.properties file?

It seems to me that you used to be able to put a core.properties file in the Solr home path with something like the following.

coreNodeName=bts_shard3_01
shard=shard3
collection=BTS

Then start Solr and it would create the sharding based on the information in the core.properties file.

When I try it with Solr 6.6 it seems to ignore the core.properties file.
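For reference, a sketch of the on-disk layout that core discovery scans for (the SOLR_HOME path is a placeholder; the core directory just reuses the names from above):

  /var/solr/data/              <- SOLR_HOME
      solr.xml
      bts_shard3_01/           <- one subdirectory per core
          core.properties      <- coreNodeName=bts_shard3_01, shard=shard3, collection=BTS
          data/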


Thanks again for your help

On 10/1/18, 11:21 PM, "Varun Thacker" <va...@vthacker.in> wrote:

    Hi Chuck,
    
    I was chatting with Noble offline and he suggested we could use this
    starting 7.5
    
    {replica:'#EQUAL', shard:'#EACH', sysprop.az:'#EACH'}
    
    where "az" is a sysprop while starting each solr instance ( -Daz=us-east-1 )
    
    It's documented
    https://lucene.apache.org/solr/guide/7_5/solrcloud-autoscaling-policy-preferences.html
    
    Let me know if this works for you.
    
    ( Looks like my previous email had some formatting issues )
    


Re: Rule-based replication or sharding

Posted by Varun Thacker <va...@vthacker.in>.
Hi Chuck,

I was chatting with Noble offline and he suggested we could use this
starting 7.5

{replica:'#EQUAL', shard:'#EACH', sysprop.az:'#EACH'}

where "az" is a sysprop passed when starting each Solr instance (e.g. -Daz=us-east-1).

It's documented
https://lucene.apache.org/solr/guide/7_5/solrcloud-autoscaling-policy-preferences.html

Let me know if this works for you.

( Looks like my previous email had some formatting issues )
