Posted to solr-user@lucene.apache.org by "Duncan, Adam" <Ad...@nordstrom.com> on 2018/07/04 16:35:42 UTC

AddReplica to shard with lowest node count

Hi all,

Our team uses SolrCloud with Solr 5.1 and is investigating an upgrade to 7.3.
Currently we have a working scale-up approach for adding a new server to the cluster after the initial collection creation.
We’ve automated the install of Solr on new servers and, following that, we register the new instance with ZooKeeper so that the server is included in the list of live nodes.
Finally, we use the CoreAdmin API ‘Create’ command to associate the new node with our collection. Solr 5.1's CoreAdmin Create command would conveniently auto-assign the new node to the shard with the fewest nodes.

In Solr 7.3, the CoreAdmin API documentation warns us not to use the Create command with SolrCloud.
We tried 7.3’s CoreAdmin API Create command regardless and, unsurprisingly, it did not work.
The 7.3 documentation suggests we use the Collections API AddReplica command. The problem with AddReplica is that it expects us to specify the shard name.
This is unfortunate, as it makes it hard for us to keep shards balanced. It puts the onus on us to work out the least populated shard via a call to the cluster status endpoint.
On top of that, we now face the problem of managing this correctly when scaling up multiple servers at once.
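
For illustration, picking the least populated shard from a cluster status response and building the matching AddReplica request might look roughly like the sketch below. It assumes the usual CLUSTERSTATUS JSON layout (cluster > collections > shards > replicas); the collection name, node name, base URL and helper function names are hypothetical, and the sample data is hand-written rather than real cluster output.

```python
from urllib.parse import urlencode


def least_populated_shard(cluster_status, collection):
    """Return the name of the shard with the fewest replicas (ties go to the lowest name)."""
    shards = cluster_status["cluster"]["collections"][collection]["shards"]
    return min(sorted(shards), key=lambda name: len(shards[name]["replicas"]))


def add_replica_url(base_url, collection, shard, node):
    """Build a Collections API ADDREPLICA request URL for the chosen shard and node."""
    params = {"action": "ADDREPLICA", "collection": collection, "shard": shard, "node": node}
    return base_url + "/admin/collections?" + urlencode(params)


# Hand-written sample shaped like a CLUSTERSTATUS response (hypothetical names, not real output).
sample_status = {
    "cluster": {
        "collections": {
            "products": {
                "shards": {
                    "shard1": {"replicas": {"core_node1": {}, "core_node3": {}}},
                    "shard2": {"replicas": {"core_node2": {}}},
                }
            }
        }
    }
}

shard = least_populated_shard(sample_status, "products")
url = add_replica_url("http://localhost:8983/solr", "products", shard, "host3:8983_solr")
print(shard)
print(url)
```

This only covers the single-node case; two servers running it at the same moment could both pick the same shard.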

Are we missing something here? Is there really no way for a node to be auto-assigned to a shard in 7.3?
And if so, are there any recommendations for an approach to reliably doing this ourselves?
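
One possible do-it-yourself approach when several servers come up at once, sketched under the assumption that a single coordinating process makes all the placement decisions (so the new servers do not race each other): read the replica counts once, then greedily hand each new node to whichever shard is currently smallest. The function name and the shard and node names are hypothetical, and this is not something Solr provides.

```python
import heapq


def plan_assignments(replica_counts, new_nodes):
    """Greedily assign each new node to the shard that is smallest at that point."""
    # Heap entries are (replica_count, shard_name), so ties break on shard name.
    heap = [(count, shard) for shard, count in replica_counts.items()]
    heapq.heapify(heap)
    plan = {}
    for node in new_nodes:
        count, shard = heapq.heappop(heap)
        plan[node] = shard
        # Count the planned replica so the next node sees the updated size.
        heapq.heappush(heap, (count + 1, shard))
    return plan


counts = {"shard1": 2, "shard2": 1, "shard3": 1}
plan = plan_assignments(counts, ["node_a", "node_b", "node_c"])
print(plan)
```

Each entry in the resulting plan would then become one AddReplica call with an explicit shard and node.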

Thanks!
Adam

Re: AddReplica to shard with lowest node count

Posted by Gus Heck <gu...@gmail.com>.
Ah hmm I guess I didn't realize the autoscaling didn't use the rule based
stuff (haven't had opportunity to work with either). If it's deprecated,
maybe that suggests we need a highly visible warning box on the ref guide
page?

On Thu, Jul 5, 2018 at 12:18 AM, Shalin Shekhar Mangar <
shalinmangar@gmail.com> wrote:

> The rule based replica placement was deprecated. The autoscaling APIs are
> the way to go. Please see
> http://lucene.apache.org/solr/guide/7_3/solrcloud-autoscaling.html
>
> Your use-case is interesting. By default, the trigger for nodeAdded event
> will move replicas from the most loaded nodes to the new node. That does
> not take care of your use-case. Can you please open a Jira to add this
> feature?



-- 
http://www.the111shift.com

Re: AddReplica to shard with lowest node count

Posted by "Duncan, Adam" <Ad...@nordstrom.com>.
Thanks for your responses.

I’ve tried to get more familiar with the Autoscaling API. I’ve applied a nodeAdded trigger, but I’m stuck trying to think of a cluster policy that would suit my scenario; something like “All new nodes must have one replica from each available collection”.
Is this possible? Or is that the point you were getting at by saying my use-case isn’t supported, Shalin? 
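
For reference, a nodeAdded trigger of the kind mentioned above is set by sending a JSON "set-trigger" command to the autoscaling API. The sketch below only builds that payload; the trigger name and waitFor value are placeholders, the helper function is hypothetical, and the exact endpoint and accepted properties should be checked against the 7.3 autoscaling guide. As Shalin notes, the default behaviour for this event moves existing replicas onto the new node, so on its own it would not express the "one replica per collection" rule.

```python
import json


def node_added_trigger(name="node_added_trigger", wait_for="5s"):
    """Build a set-trigger command body for a nodeAdded event (placeholder values)."""
    return {
        "set-trigger": {
            "name": name,
            "event": "nodeAdded",
            "waitFor": wait_for,
            "enabled": True,
        }
    }


payload = node_added_trigger()
body = json.dumps(payload, indent=2)
print(body)
```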

Regards,
Adam

On 7/4/18, 9:18 PM, "Shalin Shekhar Mangar" <sh...@gmail.com> wrote:

    The rule based replica placement was deprecated. The autoscaling APIs are
    the way to go. Please see
    http://lucene.apache.org/solr/guide/7_3/solrcloud-autoscaling.html
    
    Your use-case is interesting. By default, the trigger for nodeAdded event
    will move replicas from the most loaded nodes to the new node. That does
    not take care of your use-case. Can you please open a Jira to add this
    feature?
    


Re: AddReplica to shard with lowest node count

Posted by Shalin Shekhar Mangar <sh...@gmail.com>.
The rule based replica placement was deprecated. The autoscaling APIs are
the way to go. Please see
http://lucene.apache.org/solr/guide/7_3/solrcloud-autoscaling.html

Your use-case is interesting. By default, the trigger for nodeAdded event
will move replicas from the most loaded nodes to the new node. That does
not take care of your use-case. Can you please open a Jira to add this
feature?

On Thu, Jul 5, 2018 at 6:45 AM Gus Heck <gu...@gmail.com> wrote:

> Perhaps the rule based replica placement stuff would do the trick?
>
> https://lucene.apache.org/solr/guide/7_3/rule-based-replica-placement.html
>
> I haven't used it myself but I've seen lots of work going into it lately...


-- 
Regards,
Shalin Shekhar Mangar.

Re: AddReplica to shard with lowest node count

Posted by Gus Heck <gu...@gmail.com>.
Perhaps the rule based replica placement stuff would do the trick?

https://lucene.apache.org/solr/guide/7_3/rule-based-replica-placement.html

I haven't used it myself but I've seen lots of work going into it lately...




-- 
http://www.the111shift.com