Posted to dev@lucene.apache.org by Erick Erickson <er...@gmail.com> on 2014/12/03 01:30:51 UTC

I think rejoin leader elections at the head isn't doing what it should

I'm particularly interested in Noble and Mark's comments...

Let's say you have 5 nodes in the election queue: n1, n2, n3, n4, n5.

n1 is the leader, n2 watches n1, n3 watches n2, and so on.

Now I call retryElection for n3 with joinAtHead=true. Both n2 and n3 are
now watching n1. So far, so good.
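
A minimal sketch of the watch layout described above, as a toy model rather
than the actual election code. It assumes that each node watches the nearest
entry with a strictly lower sequence number, and that joining at head reuses
the sequence number of the current first-in-line. The class and method names
(WatchSketch, watches, afterRejoinAtHead) are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class WatchSketch {
    // Toy rule (an assumption): each node watches the nearest entry whose
    // sequence number is strictly lower than its own; the leader watches nobody.
    static Map<String, String> watches(Map<String, Integer> seqByNode) {
        Map<String, String> watched = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> me : seqByNode.entrySet()) {
            String target = null;
            int best = Integer.MIN_VALUE;
            for (Map.Entry<String, Integer> other : seqByNode.entrySet()) {
                int s = other.getValue();
                if (s < me.getValue() && s > best) {
                    best = s;
                    target = other.getKey();
                }
            }
            watched.put(me.getKey(), target); // null for the leader
        }
        return watched;
    }

    // Queue state after n3 rejoins at head in the five-node example.
    static Map<String, Integer> afterRejoinAtHead() {
        Map<String, Integer> seq = new LinkedHashMap<>();
        seq.put("n1", 0);
        seq.put("n2", 1);
        seq.put("n3", 2);
        seq.put("n4", 3);
        seq.put("n5", 4);
        // Assumption: joining at head reuses the first-in-line's sequence number.
        seq.put("n3", seq.get("n2"));
        return seq;
    }

    public static void main(String[] args) {
        Map<String, String> w = watches(afterRejoinAtHead());
        System.out.println("n2 watches " + w.get("n2"));
        System.out.println("n3 watches " + w.get("n3"));
    }
}
```

In this model n2 and n3 end up with the same sequence number, so both watch
n1 and both are notified when n1 goes away.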

My expectation is that deleting n1 would cause n3 to become leader,
but it isn't at all guaranteed. I have a test case illustrating this.

Incidentally, I think I should get the same result by calling
retryElection on n1 with joinAtHead=false; n3 should become the
leader.

I was working on SOLR-6691 and slowly going crazy since everything I
was trying would fail. Basically, to rebalance leaders (thanks Noble
for pointing out how far off I was in my original approach) it seemed
like it would be sufficient to

1> have the preferred leader retry the election at the head
2> tell the old leader to retry at the tail

I expected the old node that was watching the leader to figure out
that it wasn't really next in line and re-add itself to the end.

But things went all to hell in a handbasket when I wrote a harness
that exercised it, and it drove me a bit nuts. Especially since it
would fail one way one time and another way the next. And it'd even
succeed upon occasion....

I figured out why my expectations weren't being met. Due to the way
leader queues are sorted, if two sequence numbers are identical
then the tie-breaker does NOT pick the last node to join at head. It
picks the one with the lowest (highest? I didn't track that down
entirely) session ID. Either way, sometimes it picks the node newly
added at the head and sometimes it picks the old one.
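
The tie-break described above can be sketched as follows. This is an
illustration under stated assumptions, not the real sorting code: the node
name layout "<sessionId>-<node>-n_<sequence>", the session ID values, and the
names ElectionSortSketch, seq, ORDER and firstInLine are all made up for the
example. The comparator orders by sequence number and, on a tie, falls back to
comparing the whole name, which starts with the session ID.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class ElectionSortSketch {
    // Hypothetical name layout: "<sessionId>-<node>-n_<sequence>".
    static int seq(String electionNode) {
        int idx = electionNode.lastIndexOf("-n_");
        return Integer.parseInt(electionNode.substring(idx + 3));
    }

    // Sequence number decides first; on a tie the full name is compared,
    // so the leading session ID settles the order.
    static final Comparator<String> ORDER = (a, b) -> {
        int bySeq = Integer.compare(seq(a), seq(b));
        return bySeq != 0 ? bySeq : a.compareTo(b);
    };

    static String firstInLine(String... candidates) {
        List<String> sorted = new ArrayList<>(Arrays.asList(candidates));
        sorted.sort(ORDER);
        return sorted.get(0);
    }

    public static void main(String[] args) {
        // Same sequence number, n3 happens to hold the lower session ID.
        System.out.println(firstInLine("200-n2-n_0000000001", "100-n3-n_0000000001"));
        // Same sequence number, n2 happens to hold the lower session ID.
        System.out.println(firstInLine("100-n2-n_0000000001", "200-n3-n_0000000001"));
    }
}
```

With identical sequence numbers, which of n2 and n3 sorts first depends only
on the session ID prefix, not on which node joined at head most recently.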

If I _am_ on the right path, then I propose the following:
1> I'll raise a new JIRA for leader sequence sorting and take it on.
I'm not quite sure how to fix it; the ideas I have are fairly hacky.

2> I'll back out the REBALANCELEADER stuff. Currently it'll break
things badly and we're too close to 5.0 to try to do anything about
<1> IMO. This just means that I'll comment out the collections API
call in the code and update the ref guide.

3> When <1> is resolved, I'll put REBALANCELEADER back in, but that
won't be before 5.1.

Erick

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: I think rejoin leader elections at the head isn't doing what it should

Posted by Erick Erickson <er...@gmail.com>.
Noble:

Yep. But check me on this: so far every "flaw" I've found has been
me shooting myself in the foot! Actually, I'll make a new comment on
SOLR-6691 so we have a record there.

Thanks!
Erick

On Sun, Dec 7, 2014 at 11:15 PM, Noble Paul <no...@gmail.com> wrote:
> "Ideally, I'd like to just tell a node to rejoin at head and have to do
> nothing else."
>
> The rejoin-at-head is an internal API which is used by other APIs (and not
> exposed to others). So, in that way, it is a ready-to-cook API and not a
> ready-to-eat one. So, use it with caution.
>
> The entity that triggers the API should choreograph the entire sequence. Any
> failure in between should be handled properly.
>
> On Sat, Dec 6, 2014 at 2:25 AM, Erick Erickson <er...@gmail.com>
> wrote:
>>
>> Ahhh, I wasn't too clear.
>>
>> Ideally, I'd like to just tell a node to rejoin at head and have to do
>> nothing else. Specifically, not have to tell the old first-in-line to
>> rejoin at tail.
>>
>> If I do _not_ do the second step (i.e. tell the old first-in-line to
>> rejoin at tail) and _do_ tell the leader to rejoin at tail, the watchers
>> of both the old first-in-line and the node that rejoined at head get
>> triggered, and their sequence IDs are identical. So which one wins
>> relies on the fallback comparison of the entire election node name, which
>> starts with the session ID. Thus my comment that "it all depends on
>> the session ID that's associated".
>>
>> You're right in that there's always a first in line and it's a
>> deterministic algorithm. And I can get the behavior I want by doing the
>> step I was omitting, i.e. telling the old first-in-line to rejoin at
>> tail. Getting the behavior I was hoping for (i.e. no need to tell
>> the old first-in-line to rejoin at tail) would require reworking the
>> leader election code, which as you well know isn't something to be
>> approached lightly.
>>
>> And I don't intend to even try that after looking at that code for a
>> while. I mean saving myself the "trouble" of issuing the rejoin at
>> tail isn't even close to worth the risk.....
>>
>> On Fri, Dec 5, 2014 at 9:32 AM, Noble Paul <no...@gmail.com> wrote:
>> > "and that's not
>> > guaranteed currently unless one deletes the old first-in-line."
>> >
>> > Yeah, that is what I said in the final step: ask n1 (the current leader)
>> > to rejoin the election.
>> > The rejoin command always makes a node join at TAIL, and rejoinAtHead
>> > makes a node join right behind the current HEAD.
>> >
>> >
>> > "If one doesn't have the former first-in-line go to the tail, "
>> >
>> > I fail to understand this. There will always be a node that is first in
>> > line (as long as there is a line).
>> >
>> > "it all depends on the session ID that's associated"
>> > Really? How?
>> >
>> > On Fri, Dec 5, 2014 at 10:43 PM, Erick Erickson <er...@gmail.com> wrote:
>> >>
>> >> Noble:
>> >>
>> >> Thanks, that'll probably solve my immediate problem, but it still
>> >> seems flawed. I should be able to specify "rejoin at head" on a
>> >> particular node and next time a leader is elected the node I told to
>> >> rejoin at head _should_ be the one that comes up, and that's not
>> >> guaranteed currently unless one deletes the old first-in-line.
>> >>
>> >> If one doesn't have the former first-in-line go to the tail, then
>> >> depending on the sub-ordering the node I told to rejoin at head may or
>> >> may not become leader; it all depends on the session ID that's
>> >> associated. So in general, any time anything rejoins at head, the second
>> >> call to delete the old first-in-line is required.
>> >>
>> >> That said, if this works it'll solve my immediate problem without
>> >> getting into all the leader re-election code which, as you mentioned
>> >> before, is pretty difficult to get right.
>> >>
>> >> Thanks!
>> >> Erick
>> >>
>> >> On Thu, Dec 4, 2014 at 6:06 PM, Noble Paul <no...@gmail.com>
>> >> wrote:
>> >> > "Let's say you have 5 nodes in n1, n2, n3, n4, n5.
>> >> >
>> >> > n1 is the leader, n2 watches n1 etc.
>> >> >
>> >> > Now I retryElection for n3 with joinAtHead=true. Both n2 and n3 are
>> >> > watching n1. So far, so good.
>> >> >
>> >> > My expectation is that deleting n1 would cause n3 to become leader,
>> >> > but it isn't at all guaranteed. I have a test case illustrating this"
>> >> >
>> >> >
>> >> > Deleting n1 is not enough.
>> >> >
>> >> > Before that, you should ask n2 to rejoin the election (joinAtHead=false).
>> >> > This will ensure that n2 is at the tail. Now the order is n1, n3, n4, ...
>> >> > Now ask n1 to rejoin (not at head); it will join back at the tail and
>> >> > n3 will become leader.
>> >> >
>> >> > On Wed, Dec 3, 2014 at 7:20 AM, Erick Erickson
>> >> > <er...@gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> Thanks! I somewhat remember seeing that conversation but I confess I
>> >> >> didn't follow it that closely.
>> >> >>
>> >> >> I can't cope with looking at it any more tonight, but I'll check in
>> >> >> the morning. The problem I see is I don't think there's any way,
>> >> >> once a node is re-inserted in the queue, for another node to figure
>> >> >> out that it's not supposed to be the leader if it's first in line
>> >> >> after the nodes are sorted, but I may have missed that.
>> >> >>
>> >> >> Erick
>> >> >>
>> >> >> On Tue, Dec 2, 2014 at 5:34 PM, Jessica Mallet <me...@gmail.com> wrote:
>> >> >> > This is reminiscent of my conversation with Noble on SOLR-6095,
>> >> >> > starting at this comment:
>> >> >> >
>> >> >> > https://issues.apache.org/jira/browse/SOLR-6095?focusedCommentId=14032386&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14032386
>> >> >> >
>> >> >> > Unfortunately I dropped off following it and my memory is a bit
>> >> >> > vague right now. Reading from the comments, I think Noble had in
>> >> >> > mind that the tie-breaker can pick the wrong node (n2) to be the
>> >> >> > leader, but the wrong node will then re-initiate the process to
>> >> >> > renounce leadership and re-join (according to
>> >> >> > https://issues.apache.org/jira/browse/SOLR-6095?focusedCommentId=14032619&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14032619).
>> >> >> >
>> >> >> > I then asked about when that renounce process will happen for n2
>> >> >> > (https://issues.apache.org/jira/browse/SOLR-6095?focusedCommentId=14032659&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14032659),
>> >> >> > and I'm not sure if that was ever specifically answered. Figuring
>> >> >> > out if and how that happens might be key in moving forward?
>> >> >> >
>> >> >> > Jessica
>> >> >> >
>> >> >
>> >> >
>> >> >
>> >> > --
>> >> > -----------------------------------------------------
>> >> > Noble Paul
>> >>
>> >
>> >
>> >
>> > --
>> > -----------------------------------------------------
>> > Noble Paul
>>
>
>
>
> --
> -----------------------------------------------------
> Noble Paul



Re: I think rejoin leader elections at the head isn't doing what it should

Posted by Noble Paul <no...@gmail.com>.
"Ideally, I'd like to just tell a node to rejoin at head and have to do
nothing else."

The rejoin-at-head is an internal API which is used by other APIs (and not
exposed to others). So, in that way, it is a ready-to-cook API and not a
ready-to-eat one. So, use it with caution.

The entity that triggers the API should choreograph the entire sequence.
Any failure in between should be handled properly.
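
To make the choreography concrete, here is a small in-memory simulation of the
sequence discussed in this thread: the preferred node rejoins at head, the old
first-in-line rejoins at tail, then the leader rejoins at tail. It is only a
sketch under assumptions, not Solr's leader election code: the queue is a
plain list sorted by sequence number with the session ID as tie-breaker,
joining at head is assumed to reuse the sequence number of the entry right
behind the leader, the head of the sorted list is treated as the leader, and
all names (RebalanceSketch, rejoinAtHead, rejoinAtTail, the session IDs) are
hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class RebalanceSketch {
    // One election-queue entry in this toy model.
    static final class Entry {
        final String name;
        final long sessionId;
        final int seq;
        Entry(String name, long sessionId, int seq) {
            this.name = name;
            this.sessionId = sessionId;
            this.seq = seq;
        }
    }

    private final List<Entry> queue = new ArrayList<>();
    private int nextSeq = 0;

    private void add(String name, long sessionId, int seq) {
        queue.add(new Entry(name, sessionId, seq));
        // Sequence number first, session ID as the tie-breaker.
        queue.sort(Comparator.comparingInt((Entry e) -> e.seq)
                             .thenComparingLong(e -> e.sessionId));
    }

    private Entry remove(String name) {
        for (int i = 0; i < queue.size(); i++) {
            if (queue.get(i).name.equals(name)) {
                return queue.remove(i);
            }
        }
        throw new IllegalArgumentException("unknown node " + name);
    }

    void join(String name, long sessionId) {
        add(name, sessionId, nextSeq++);
    }

    void rejoinAtTail(String name) {
        Entry old = remove(name);
        add(name, old.sessionId, nextSeq++);
    }

    void rejoinAtHead(String name) {
        Entry old = remove(name);
        // Assumption: reuse the sequence number of the entry right behind the leader.
        int seq = queue.size() > 1 ? queue.get(1).seq : nextSeq++;
        add(name, old.sessionId, seq);
    }

    // The head of the sorted queue stands in for the leader.
    String leader() {
        return queue.get(0).name;
    }

    static RebalanceSketch fiveNodes() {
        RebalanceSketch q = new RebalanceSketch();
        for (int i = 1; i <= 5; i++) {
            q.join("n" + i, 100 + i); // made-up session IDs
        }
        return q;
    }

    // All three steps: n3 to head, old first-in-line n2 to tail, leader n1 to tail.
    static String fullSequence() {
        RebalanceSketch q = fiveNodes();
        q.rejoinAtHead("n3");
        q.rejoinAtTail("n2");
        q.rejoinAtTail("n1");
        return q.leader();
    }

    // Same, but n2 is never moved, so n2 and n3 share a sequence number.
    static String withoutMovingN2() {
        RebalanceSketch q = fiveNodes();
        q.rejoinAtHead("n3");
        q.rejoinAtTail("n1");
        return q.leader();
    }

    public static void main(String[] args) {
        System.out.println("full sequence: " + fullSequence());
        System.out.println("n2 not moved:  " + withoutMovingN2());
    }
}
```

In this model the full sequence leaves n3 at the head of the queue. When n2 is
not moved, the outcome falls to the session ID tie-break, and with the made-up
IDs used here n2 comes out ahead of n3.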


-- 
-----------------------------------------------------
Noble Paul

Re: I think rejoin leader elections at the head isn't doing what it should

Posted by Erick Erickson <er...@gmail.com>.
Ahhh, I wasn't too clear.

Ideally, I'd like to just tell a node to rejoin at head and have to do
nothing else. Specifically, not have to tell the old first-in-line to
rejoin at tail.

If I do _not_ do the second step (i.e. tell the old first-in-line to
rejoin at tail) and _do_ tell the leader to rejoin at tail, the watchers
of both the old first-in-line and the node that rejoined at head get
triggered, and their sequence IDs are identical. So which one wins
relies on the fallback comparison of the entire election node name, which
starts with the session ID. Thus my comment that "it all depends on
the session ID that's associated".

You're right in that there's always a first in line and it's a
deterministic algorithm. And I can get the behavior I want by doing the
step I was omitting, i.e. telling the old first-in-line to rejoin at
tail. Getting the behavior I was hoping for (i.e. no need to tell
the old first-in-line to rejoin at tail) would require reworking the
leader election code, which as you well know isn't something to be
approached lightly.

And I don't intend to even try that after looking at that code for a
while. I mean saving myself the "trouble" of issuing the rejoin at
tail isn't even close to worth the risk.....

On Fri, Dec 5, 2014 at 9:32 AM, Noble Paul <no...@gmail.com> wrote:
> "and that's not
> guaranteed currently unless one deletes the old first-in-line."
>
> Yeah, that is what I said in the final step. ask n1 (the current leader) to
> rejoin election.
> The rejoin command always makes a node join at TAIL and  rejoinAtHead makes
> a node join right behind the current HEAD
>
>
> "If one doesn't have the former first-in-line go to the tail, "
>
> I fail to understand this. There will be always a node that is first in line
> (as long as there is a line)
>
> "it all depends on the session ID that's associated"
>  really? how?
>
> On Fri, Dec 5, 2014 at 10:43 PM, Erick Erickson <er...@gmail.com>
> wrote:
>>
>> Noble:
>>
>> Thanks, that'll probably solve my immediate problem, but it still
>> seems flawed. I should be able to specify "rejoin at head" on a
>> particular node and next time a leader is elected the node I told to
>> rejoin at head _should_ be the one that comes up, and that's not
>> guaranteed currently unless one deletes the old first-in-line.
>>
>> If one doesn't have the former first-in-line go to the tail, then
>> depending on the sub-ordering the node I told to rejoin at head may or
>> may not become leader, it all depends on the session ID that's
>> associated. So in general any time anything rejoins at head the second
>> call to delete the old first-in-line is required.
>>
>> That said, if this work it'll solve my immediate problem without
>> getting into all the leader re-election code which, as you mentioned
>> before, is pretty difficult to get right.
>>
>> Thanks!
>> Erick
>>
>> On Thu, Dec 4, 2014 at 6:06 PM, Noble Paul <no...@gmail.com> wrote:
>> > "Let's say you have 5 nodes in n1, n2, n3, n4, n5.
>> >
>> > n1 is the leader, n2 watches n1 etc.
>> >
>> > Now I retryElection for n3 with joinAtHead=true. Both n2 and n3 are
>> > watching n1. So far, so good.
>> >
>> > My expectation is that deleting n1 would cause n3 to become leader,
>> > but it isn't at all guaranteed. I have a test case illustrating this"
>> >
>> >
>> > deleting n1 is not enough
>> >
>> > before that, you should ask n2 to rejoin election (joinAthead=false).
>> > This
>> > will ensure that n2 is at tail now. Now the order is n1,n3,n4....
>> > now ask n1 to rejoin (not at head) and it will join back at tail and n3
>> > will
>> > become leader
>> >
>> > On Wed, Dec 3, 2014 at 7:20 AM, Erick Erickson <er...@gmail.com>
>> > wrote:
>> >>
>> >> Thanks! I somewhat remember seeing that conversation but I confess I
>> >> didn't follow it that closely.
>> >>
>> >> I can't cope with looking at it any more tonight, but I'll check in
>> >> the morning. The problem I see is I don't think there's any way, once
>> >> a node is re-inserted in the queue, for another node to figure out
>> >> that it's not supposed to be the leader if it's first in line after
>> >> the nodes are sorted, but I may have missed that.
>> >>
>> >> Erick
>> >>
>> >> On Tue, Dec 2, 2014 at 5:34 PM, Jessica Mallet <me...@gmail.com>
>> >> wrote:
>> >> > This is reminiscent of my conversation with Noble on this SOLR-6095
>> >> > starting
>> >> > at this comment:
>> >> >
>> >> >
>> >> > https://issues.apache.org/jira/browse/SOLR-6095?focusedCommentId=14032386&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14032386
>> >> >
>> >> > Unfortunately I dropped off following it and my memory is a bit vague
>> >> > right
>> >> > now. Reading from the comments, I think Noble had in mind that the
>> >> > tie-breaker can pick the wrong node (n2) to be the leader, but then
>> >> > the
>> >> > wrong node will then re-initiate the process to renounce leadership
>> >> > and
>> >> > re-join (according to
>> >> >
>> >> >
>> >> > https://issues.apache.org/jira/browse/SOLR-6095?focusedCommentId=14032619&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14032619).
>> >> >
>> >> > I then asked about when that renounce process will happen for n2
>> >> >
>> >> >
>> >> > (https://issues.apache.org/jira/browse/SOLR-6095?focusedCommentId=14032659&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14032659),
>> >> > and I'm not sure if that was ever specifically answered. Figuring if
>> >> > and
>> >> > how
>> >> > that happens might be key in moving forward?
>> >> >
>> >> > Jessica
>> >> >
>> >> > On Tue, Dec 2, 2014 at 4:30 PM, Erick Erickson
>> >> > <er...@gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> I'm particularly interested in Noble and Mark's comments...
>> >> >>
>> >> >> Let's say you have 5 nodes in n1, n2, n3, n4, n5.
>> >> >>
>> >> >> n1 is the leader, n2 watches n1 etc.
>> >> >>
>> >> >> Now I retryElection for n3 with joinAtHead=true. Both n2 and n3 are
>> >> >> watching n1. So far, so good.
>> >> >>
>> >> >> My expectation is that deleting n1 would cause n3 to become leader,
>> >> >> but it isn't at all guaranteed. I have a test case illustrating
>> >> >> this.
>> >> >>
>> >> >> Incidentally, I think I should get the same result by calling
>> >> >> retryElection on n1 with joinAtHead=false; n3 should become the
>> >> >> leader.
>> >> >>
>> >> >> I was working on SOLR-6691 and slowly going crazy since everything I
>> >> >> was trying would fail. Basically, to rebalance leaders (thanks Noble
>> >> >> for pointing out how far off I was in my original approach) it
>> >> >> seemed
>> >> >> like it would be sufficient to
>> >> >>
>> >> >> 1> have the preferred leader retry the election at the head
>> >> >> 2> tell the old leader to retry at the tail
>> >> >>
>> >> >> I expected the old node that was watching the leader to figure out
>> >> >> that it wasn't really next in line and re-add itself to the end.
>> >> >>
>> >> >> But things went all to hell in a handbasket when I wrote a harness
>> >> >> that exercised it, and it drove me a bit nuts. Especially since it
>> >> >> would fail one way one time and another way the next. And it'd even
>> >> >> succeed upon occasion....
>> >> >>
>> >> >> I figured out that my expectations weren't being met. Due to the way
>> >> >> leader queues are sorted, if the two sequence numbers are identical
>> >> >> then the tie-breaker does NOT pick the last node to join at head.
>> >> >> It picks the one with the lowest (highest? didn't track that down
>> >> >> entirely) session ID. Either way, sometimes it picks the node newly
>> >> >> added at the head and sometimes it picks the old one.
>> >> >>
>> >> >> If I _am_ on the right path, then I propose the following:
>> >> >> 1> I'll raise a new JIRA for leader sequence sorting and take it on.
>> >> >> I'm not quite sure how fix it, the ideas I have are fairly hacky.
>> >> >>
>> >> >> 2> I'll back out the REBALANCELEADER  stuff. Currently it'll break
>> >> >> things badly and we're too close to 5.0 to try to do anything about
>> >> >> <1> IMO. this just means that I'll comment out the collections API
>> >> >> call in the code and update the ref guide.
>> >> >>
>> >> >> 3> When <1> is resolved, I'll put REBALANCELEADERs back in, but that
>> >> >> won't be before 5.1
>> >> >>
>> >> >> Erick
>> >> >>
>> >> >>


Re: I think rejoin leader elections at the head isn't doing what it should

Posted by Noble Paul <no...@gmail.com>.
"and that's not
guaranteed currently unless one deletes the old first-in-line."

Yeah, that is what I said in the final step: ask n1 (the current leader) to
rejoin the election.
The rejoin command always makes a node join at the TAIL, and rejoinAtHead
makes a node join right behind the current HEAD.


"If one doesn't have the former first-in-line go to the tail, "

I fail to understand this. There will always be a node that is first in
line (as long as there is a line).

"it all depends on the session ID that's associated"
Really? How?

-- 
-----------------------------------------------------
Noble Paul

Re: I think rejoin leader elections at the head isn't doing what it should

Posted by Erick Erickson <er...@gmail.com>.
Noble:

Thanks, that'll probably solve my immediate problem, but it still
seems flawed. I should be able to specify "rejoin at head" on a
particular node and next time a leader is elected the node I told to
rejoin at head _should_ be the one that comes up, and that's not
guaranteed currently unless one deletes the old first-in-line.

If one doesn't have the former first-in-line go to the tail, then
depending on the sub-ordering the node I told to rejoin at head may or
may not become leader; it all depends on the associated session ID.
So in general, any time anything rejoins at head, the second call to
delete the old first-in-line is required.
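
To make the sub-ordering concrete, here is a minimal Python sketch. It is
not taken from the Solr source: the node-name layout
(<sessionId>-<coreNode>-n_<sequence>), the session IDs, and the helper names
are all assumptions made for illustration. It only shows how a sort on
sequence number that falls back to the full name lets the session ID decide
between two nodes sharing a sequence number.

```python
import re


def seq_of(node_name):
    """Return the trailing sequence number of a (hypothetical) election node name."""
    return int(re.search(r"-n_(\d+)$", node_name).group(1))


def sort_election_nodes(names):
    """Sort by sequence number; break ties on the full name.

    Because the assumed name starts with the session ID, a tie on the
    sequence number is effectively decided by the session ID.
    """
    return sorted(names, key=lambda name: (seq_of(name), name))


# n1 is the current leader. n2 has been second in line all along, and n3 has
# just rejoined at the head, so n2 and n3 carry the same sequence number.
n1 = "500-core_node1-n_0000000000"
n2 = "200-core_node2-n_0000000001"
n3_high_session = "900-core_node3-n_0000000001"  # made-up session ID
n3_low_session = "100-core_node3-n_0000000001"   # made-up session ID

# Once n1 goes away, the first entry of the sorted list is next in line.
print(sort_election_nodes([n2, n3_high_session])[0])  # 200-core_node2-n_0000000001
print(sort_election_nodes([n2, n3_low_session])[0])   # 100-core_node3-n_0000000001
```

With one session ID n2 wins the tie, with the other n3 does, which matches
the "sometimes one, sometimes the other" behavior seen in the test harness.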

That said, if this works, it'll solve my immediate problem without
getting into all the leader re-election code which, as you mentioned
before, is pretty difficult to get right.

Thanks!
Erick


Re: I think rejoin leader elections at the head isn't doing what it should

Posted by Noble Paul <no...@gmail.com>.
"Let's say you have 5 nodes in n1, n2, n3, n4, n5.

n1 is the leader, n2 watches n1 etc.

Now I retryElection for n3 with joinAtHead=true. Both n2 and n3 are
watching n1. So far, so good.

My expectation is that deleting n1 would cause n3 to become leader,
but it isn't at all guaranteed. I have a test case illustrating this"


Deleting n1 is not enough.

Before that, you should ask n2 to rejoin the election (joinAtHead=false).
This will ensure that n2 is now at the tail, so the order is n1, n3, n4, ...
Now ask n1 to rejoin (not at head); it will join back at the tail and n3
will become leader.
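
The sequence of steps can be walked through with a toy model. This is a
sketch under stated assumptions, not Solr's LeaderElector: ElectionQueue and
its methods are hypothetical names, a tail join takes the next counter
value, a head join copies the sequence number of the node right behind the
leader, and ties are broken by node name purely to keep the model
deterministic (in the real queue the tie-break is what is under discussion).

```python
class ElectionQueue:
    """Toy model of an election queue ordered by sequence number."""

    def __init__(self, nodes):
        self.next_seq = 0
        self.entries = {}  # node name -> sequence number
        for node in nodes:
            self.rejoin(node)

    def rejoin(self, node, at_head=False):
        """Remove the node's old entry (if any) and add it back."""
        self.entries.pop(node, None)
        if at_head and len(self.entries) >= 2:
            # Assumption: joining at head reuses the sequence number of
            # the node currently right behind the leader.
            second_in_line = self.order()[1]
            self.entries[node] = self.entries[second_in_line]
        else:
            # Joining at the tail takes the next counter value.
            self.entries[node] = self.next_seq
            self.next_seq += 1

    def order(self):
        # Name is used as the tie-break only so the model is deterministic.
        return sorted(self.entries, key=lambda n: (self.entries[n], n))

    def leader(self):
        return self.order()[0]


queue = ElectionQueue(["n1", "n2", "n3", "n4", "n5"])
queue.rejoin("n3", at_head=True)  # n2 and n3 now share a sequence number
queue.rejoin("n2")                # n2 goes to the tail: n1, n3, n4, n5, n2
queue.rejoin("n1")                # old leader goes to the tail
print(queue.order())              # ['n3', 'n4', 'n5', 'n2', 'n1']
print(queue.leader())             # n3
```

In this model the outcome no longer depends on the tie-break, because n2 has
left the tied position before n1 gives up leadership.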

-- 
-----------------------------------------------------
Noble Paul

Re: I think rejoin leader elections at the head isn't doing what it should

Posted by Erick Erickson <er...@gmail.com>.
Thanks! I somewhat remember seeing that conversation but I confess I
didn't follow it that closely.

I can't cope with looking at it any more tonight, but I'll check in
the morning. The problem I see is I don't think there's any way, once
a node is re-inserted in the queue, for another node to figure out
that it's not supposed to be the leader if it's first in line after
the nodes are sorted, but I may have missed that.

Erick


Re: I think rejoin leader elections at the head isn't doing what it should

Posted by Jessica Mallet <me...@gmail.com>.
This is reminiscent of my conversation with Noble on SOLR-6095,
starting at this comment:
https://issues.apache.org/jira/browse/SOLR-6095?focusedCommentId=14032386&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14032386

Unfortunately I dropped off following it and my memory is a bit vague right
now. Reading from the comments, I think Noble had in mind that the
tie-breaker can pick the wrong node (n2) to be the leader, but the
wrong node will then re-initiate the process to renounce leadership and
re-join (according to
https://issues.apache.org/jira/browse/SOLR-6095?focusedCommentId=14032619&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14032619
).

I then asked about when that renounce process will happen for n2 (
https://issues.apache.org/jira/browse/SOLR-6095?focusedCommentId=14032659&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14032659),
and I'm not sure if that was ever specifically answered. Figuring out if and
how that happens might be key in moving forward?

Jessica
