Posted to dev@solr.apache.org by Pierre Salagnac <pi...@gmail.com> on 2023/03/06 15:27:27 UTC

Re: preferredLeader is useless during election or node restart

I discussed this issue offline with David, and I'm now working on a code
change so that the preferredLeader replica becomes the leader when it is
registered.

The idea is that, when we register a replica from ZooKeeper, we check
whether it has the preferredLeader flag. If it does, we tell the current
leader to stop being the leader right after the replica has joined the
election queue. This can be done by sending the REJOINLEADERELECTION
command to the current leader.

Now, my issue is that this works great with 2 replicas. With 3 or more,
REJOINLEADERELECTION does not have the intended effect.
Looking deeper into the LeaderElector class, I found that the preferred
leader replica does not join the election queue right after the current
leader. It usually joins as second in the queue (one more candidate sits
between the current leader and where we join the queue).

Then, the RebalanceLeaders command moves all the candidates with the same
sequence number as the preferred leader to the end of the queue.


=> So my question is: for the preferred leader, why don't we join the
election right after the current leader?


The current implementation of the RebalanceLeaders command is:
- if not already the case, ask the preferred leader to rejoin at head
- ask all nodes with same sequence number as the preferred leader to rejoin
at end of the queue
- ask current leader to rejoin at end of the queue
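
As a rough illustration, the three steps above can be simulated on a toy
queue. This is a simplified Python model, not Solr code: the function
names, the (sequence, replica) tuples, and the rule that a replica joining
"at head" reuses the sequence number of the candidate right behind the
leader (with ties kept in arrival order) are all assumptions made for the
sketch.

```python
def join_at_end(queue, next_seq, replica):
    """Re-register `replica` with a fresh, highest sequence number."""
    queue = [e for e in queue if e[1] != replica]
    queue.append((next_seq, replica))
    return queue, next_seq + 1


def join_at_head(queue, replica):
    """Re-register `replica` with the sequence number of the candidate
    right behind the leader (assumed behaviour of this toy model)."""
    queue = [e for e in queue if e[1] != replica]
    # Only called when the queue holds at least 3 replicas,
    # so queue[1] still exists after the removal above.
    seq = queue[1][0]
    queue.append((seq, replica))
    # Stable sort: candidates sharing a sequence keep arrival order.
    return sorted(queue, key=lambda e: e[0])


def rebalance_current(queue, next_seq, preferred):
    """queue[0] is the leader; `preferred` must not be the leader.
    Returns the new queue and the list of rejoin requests sent."""
    requests = []
    # Step 1: if not already the case, preferred leader rejoins at head.
    if queue[1][1] != preferred:
        queue = join_at_head(queue, preferred)
        requests.append(("head", preferred))
    # Step 2: candidates sharing its sequence number rejoin at the end.
    pref_seq = next(seq for seq, r in queue if r == preferred)
    for seq, replica in list(queue[1:]):
        if seq == pref_seq and replica != preferred:
            queue, next_seq = join_at_end(queue, next_seq, replica)
            requests.append(("end", replica))
    # Step 3: the current leader rejoins at the end.
    leader = queue[0][1]
    queue, next_seq = join_at_end(queue, next_seq, leader)
    requests.append(("end", leader))
    return queue, requests


if __name__ == "__main__":
    start = [(0, "A"), (1, "B"), (2, "C")]  # A leads, C is preferred
    print(rebalance_current(start, 3, "C"))
```

In this model, with A as leader and C as preferred leader, C ends up at the
head of the queue only after three rejoin requests (C, then B, then A).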

By using the same sequence number as the current leader, we would not have
to ask several nodes to rejoin at the end of the queue in most cases.
The RebalanceLeaders command would then usually be just:
- if not already the case, ask preferred leader to rejoin at head
- ask current leader to rejoin at end of the queue
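
A minimal sketch of this shorter sequence, again on a toy Python queue and
not Solr code. The helper names are hypothetical, and the sketch assumes
that a replica reusing the leader's sequence number is ordered directly
behind the leader.

```python
def join_at_end(queue, next_seq, replica):
    """Re-register `replica` with a fresh, highest sequence number."""
    queue = [e for e in queue if e[1] != replica]
    queue.append((next_seq, replica))
    return queue, next_seq + 1


def join_behind_leader(queue, replica):
    """Hypothetical variant: reuse the leader's sequence number so the
    replica lands directly behind the current leader."""
    leader_seq = queue[0][0]
    rest = [e for e in queue[1:] if e[1] != replica]
    return [queue[0], (leader_seq, replica)] + rest


def rebalance_proposed(queue, next_seq, preferred):
    """queue[0] is the leader; `preferred` must not be the leader.
    Returns the new queue and the list of rejoin requests sent."""
    requests = []
    # Step 1: if not already the case, preferred leader rejoins at head.
    if queue[1][1] != preferred:
        queue = join_behind_leader(queue, preferred)
        requests.append(("head", preferred))
    # Safety check kept from the current logic; it normally matches
    # nothing, because only the leader shares this sequence number.
    pref_seq = next(seq for seq, r in queue if r == preferred)
    for seq, replica in list(queue[1:]):
        if seq == pref_seq and replica != preferred:
            queue, next_seq = join_at_end(queue, next_seq, replica)
            requests.append(("end", replica))
    # Step 2: the current leader rejoins at the end.
    leader = queue[0][1]
    queue, next_seq = join_at_end(queue, next_seq, leader)
    requests.append(("end", leader))
    return queue, requests


if __name__ == "__main__":
    start = [(0, "A"), (1, "B"), (2, "C")]  # A leads, C is preferred
    print(rebalance_proposed(start, 3, "C"))
```

With the same 3 replicas (A leader, C preferred), this model needs only two
rejoin requests: C at head, then A at the end.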

We should keep the logic of checking for other nodes with the same sequence
number, but such nodes will not exist in most cases.

On Mon, Feb 27, 2023 at 6:42 PM David Smiley <ds...@apache.org> wrote:

> I found this existing issue:
> https://issues.apache.org/jira/browse/SOLR-8238
> I commented on it just now.  Erick isn't around anymore but I'd appreciate
> input from anyone using "preferredLeader".
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Mon, Feb 20, 2023 at 12:49 PM David Smiley <ds...@apache.org> wrote:
>
> > Seems like a bug to me!
> > Recommended reading: https://issues.apache.org/jira/browse/SOLR-6491
> > There's a treasure trove of information in JIRA to learn about how code
> > comes to be; what were the intentions behind features; what alternatives
> > were explored; pros & cons.
> >
> > ~ David Smiley
> > Apache Lucene/Solr Search Developer
> > http://www.linkedin.com/in/davidwsmiley
> >
> >
> > On Mon, Feb 20, 2023 at 9:04 AM Bruno Roustant <bruno.roustant@gmail.com>
> > wrote:
> >
> >> After many tests and deployments, it appears the preferredLeader flag
> >> described in the RebalanceLeader command doc [1] is not useful.
> >> It is taken into account only during the rebalance command. Afterwards,
> >> if there is a leader election or some node restart, it is ignored.
> >>
> >> Is this preferredLeader useless?
> >> I thought to use it to make leadership kind of sticky, but in practice
> >> the leadership assignment quickly returns to randomness. So, what was the
> >> purpose of this flag for the rebalance command, really a one-shot leader
> >> assignment, ignored after? Or is it a bug?
> >>
> >> Indeed only the rebalance leader command doc talks about this replica
> >> property. It is not mentioned elsewhere. But if it is ignored elsewhere,
> >> it's not of a great help.
> >>
> >> Should I enter a bug on preferredLeader property not respected during
> >> leader elections?
> >>
> >> [1]
> >> https://solr.apache.org/guide/solr/latest/deployment-guide/collection-management.html#rebalanceleaders
> >>
> >
>