Posted to dev@lucene.apache.org by Erick Erickson <er...@gmail.com> on 2016/08/22 20:23:38 UTC

Peer sync and overlapping "windows" heuristic may be causing unnecessary full syncs?

I'm looking at the peer sync code and I don't quite understand the
condition where we report " Our versions are too old." (about line 498
in PeerSync.java, 6.x).

Note that the "in the field" version is 5.3.1, but the code looks the
same in 6.x.

I get that we're testing the overlap between the versions we have and
the versions our peer has. But how was the 20% overlap number arrived
at? What is it intended to guarantee? And in a case where the requested
number of updates is > the size of the returned list, is it valid to
return true if there is _any_ overlap?
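
For illustration, the windowing test can be sketched roughly as below.
This is a simplified sketch, not the PeerSync source: the class and
method names are hypothetical, and the 0.2/0.8 fractions are
assumptions chosen to mirror the 20% figure mentioned above. It also
assumes plain positive version numbers.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; none of these names come from PeerSync.java.
class OverlapSketch {
    // Helper for building a contiguous run of version numbers.
    static List<Long> range(long from, long to) {
        List<Long> out = new ArrayList<>();
        for (long v = from; v <= to; v++) out.add(v);
        return out;
    }

    // Version found at the given fraction of the list once it is
    // sorted newest-first (0.2 is near the newest end, 0.8 near the oldest).
    static long percentile(List<Long> versions, double frac) {
        List<Long> sorted = new ArrayList<>(versions);
        sorted.sort(Collections.reverseOrder());
        int idx = Math.min((int) (sorted.size() * frac), sorted.size() - 1);
        return sorted.get(idx);
    }

    // True when even the newer part of our window is older than the
    // older part of the peer's window (assumed form of the check).
    static boolean oursTooOld(List<Long> ours, List<Long> theirs) {
        if (ours.isEmpty() || theirs.isEmpty()) {
            throw new IllegalArgumentException("both version lists must be non-empty");
        }
        long ourHigh = percentile(ours, 0.2);
        long theirLow = percentile(theirs, 0.8);
        return ourHigh < theirLow;
    }
}
```

With our versions 1..100 and the peer's 90..189, eleven versions
(90..100) overlap, yet ourHigh is 80 and theirLow is 109, so this
sketch still reports "too old".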


Why do I care? I'm seeing a case in the field where a very large
document exceeds the timeout even though it indexes successfully on
the follower; it just takes a while. The Solr node is up and accepting
more updates etc. No updates have actually been missed AFAICT.

So the leader is telling the follower to sync due to the timeout. The
follower fails the test above and then goes into full sync
unnecessarily. Since this is a very large index this takes a very long
time, strains the system and the problem can cascade.

I'm wondering if this test can be relaxed when the versions list
returned from the peer is smaller than requested to not fail if there
is any overlap. This feels like an incomplete fix though, because I'm
taking it on faith that if the list returned == numRecordsToKeep, then
this test wouldn't be as likely to be tripped. But there's no
guarantee there so a special test in this case would just kick the can
down the road I think.
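
The relaxed rule described above could look something like the
following. This is only a sketch under stated assumptions: the names
are hypothetical, "requested" stands for the number of updates asked
for, and the result of the existing test is passed in as a boolean
rather than recomputed.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the relaxed test; not taken from Solr.
class RelaxedOverlapSketch {
    // True if the two version lists share at least one version.
    static boolean anyOverlap(List<Long> ours, List<Long> theirs) {
        Set<Long> seen = new HashSet<>(ours);
        for (Long v : theirs) {
            if (seen.contains(v)) return true;
        }
        return false;
    }

    // strictCheckPassed is the outcome of the existing overlap test.
    // When the peer returned fewer versions than requested, any
    // overlap at all is accepted instead of failing.
    static boolean canPeerSync(List<Long> ours, List<Long> theirs,
                               int requested, boolean strictCheckPassed) {
        boolean peerReturnedShortList = theirs.size() < requested;
        return strictCheckPassed
                || (peerReturnedShortList && anyOverlap(ours, theirs));
    }
}
```

As noted above, this leaves the case where the peer returns a full
list untouched, so it narrows the problem rather than removing it.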

Can we do a different test perhaps (and I'm really reaching here into
unfamiliar code so this may be all wet)? Let's say the leader gets a
timeout. Rather than do a full peer sync, would it be possible to have
the leader ask the follower "Hey, I sent you these versions and you
timed out, do you really have them or not?"? And if the follower was
still processing them, we wouldn't have to do any peer sync at all.
Assuming we could guarantee that the doc was in the replica's tlog when
answering, would that guarantee data integrity?
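
The "do you really have them?" exchange might be sketched like this.
Everything here is hypothetical: no such request exists in the code
being discussed, and the follower's tlog is modeled as a plain set of
version numbers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the proposed confirmation step.
class VersionConfirmSketch {
    // Follower side: report which of the versions the leader sent
    // are not (yet) recorded in the follower's tlog.
    static List<Long> missingVersions(Set<Long> versionsInTlog,
                                      List<Long> sentByLeader) {
        List<Long> missing = new ArrayList<>();
        for (Long v : sentByLeader) {
            if (!versionsInTlog.contains(v)) missing.add(v);
        }
        return missing;
    }

    // Leader side: a sync is only needed if something is missing.
    static boolean needsSync(Set<Long> versionsInTlog,
                             List<Long> sentByLeader) {
        return !missingVersions(versionsInTlog, sentByLeader).isEmpty();
    }
}
```

The open question from the paragraph above still applies: this only
helps if an entry in the tlog is enough to guarantee the update will
not be lost.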

I can raise a JIRA if any of this makes sense.

Erick

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

Re: Peer sync and overlapping "windows" heuristic may be causing unnecessary full syncs?

Posted by Erick Erickson <er...@gmail.com>.

Pushkar:

I've been on vacation for a bit, I'll take a look later. Actually,
I'll probably just see if I can piggy-back on Noble's efforts ;).

The "in the field" situation I'm seeing, though, is in 5.x, where
fingerprinting isn't in place, so the real fix will likely require an
upgrade.

Thanks for pointing that JIRA out!

Best,
Erick

On Fri, Sep 2, 2016 at 7:37 AM, Pushkar Raste <pu...@gmail.com> wrote:
> Erick,
>
> If your updates are infrequent, you may benefit from a patch for
> https://issues.apache.org/jira/browse/SOLR-9446
> It seems like recovery always tries PeerSync first and falls back to
> replication if PeerSync fails.
>
> My fix for a different use case was to have the node trying PeerSync first
> check whether the fingerprint for the max version it has matches the
> fingerprint for the max version the leader has. If that checks out,
> PeerSync simply returns true.
>
> There may be a better way to do this. However, for your use case, if
> updates are not that frequent, nodes will not go into replication recovery.


Re: Peer sync and overlapping "windows" heuristic may be causing unnecessary full syncs?

Posted by Pushkar Raste <pu...@gmail.com>.

Erick,

If your updates are infrequent, you may benefit from a patch for
https://issues.apache.org/jira/browse/SOLR-9446
It seems like recovery always tries PeerSync first and falls back to
replication if PeerSync fails.

My fix for a different use case was to have the node trying PeerSync first
check whether the fingerprint for the max version it has matches the
fingerprint for the max version the leader has. If that checks out,
PeerSync simply returns true.

There may be a better way to do this. However, for your use case, if
updates are not that frequent, nodes will not go into replication recovery.
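
A rough sketch of that short-circuit, under one reading of the
description above: each side summarizes the versions it holds, and if
the summaries match, PeerSync returns true without doing any work. The
names and the fingerprint contents (max version, count, and an
order-independent hash) are assumptions for illustration, not the
SOLR-9446 patch.

```java
import java.util.Arrays;
import java.util.Collection;

// Hypothetical sketch; not the fingerprint code from Solr.
class FingerprintSketch {
    // Summarize a set of versions as {max version, count, hash}.
    // The hash is a sum of mixed values, so ordering does not matter.
    static long[] fingerprint(Collection<Long> versions) {
        long max = 0;
        long count = 0;
        long hash = 0;
        for (long v : versions) {
            max = Math.max(max, v);
            count++;
            hash += v * 0x9E3779B97F4A7C15L; // simple multiplicative mix
        }
        return new long[] {max, count, hash};
    }

    // True when both sides produce the same fingerprint, in which
    // case the sync could return success immediately.
    static boolean alreadyInSync(Collection<Long> ours,
                                 Collection<Long> leaders) {
        return Arrays.equals(fingerprint(ours), fingerprint(leaders));
    }
}
```

This only avoids recovery when the replica has fully caught up, which
is why it suits the infrequent-update case.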
