Posted to dev@lucene.apache.org by "Yonik Seeley (JIRA)" <ji...@apache.org> on 2015/04/22 23:33:59 UTC

[jira] [Comment Edited] (SOLR-7332) Seed version buckets with max version from index

    [ https://issues.apache.org/jira/browse/SOLR-7332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507943#comment-14507943 ] 

Yonik Seeley edited comment on SOLR-7332 at 4/22/15 9:33 PM:
-------------------------------------------------------------

bq. Next, I tried increasing the number of reducers I was using to see how hard I could push Solr and unfortunately, I ended up with 2 shards that had replicas that were out-of-sync with their leader. 

Were there any recoveries or leader changes during the run?
In a way, it's great that you saw this!  The fact that the run consisted only of new adds should significantly narrow down what this could be.  Hopefully you'll be able to reproduce it.

bq. can you think of a case where docs could be dropped with this new version bucket seeding stuff?

No... if we accidentally set the version too high, there are no correctness issues, just extra checks.
If we accidentally set the version too low, then we can fail to drop repeated or reordered updates.  But in your test run, this shouldn't be an issue since it's only adds.  Any old repeats won't change the number of docs in the index (or which docs they are).
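
To make the too-high / too-low reasoning concrete, here is a minimal toy sketch of one version bucket on a replica.  It is not Solr's actual code: the class {{ToyVersionBucket}}, its fields, and the in-memory map standing in for the index are all hypothetical, and the check is only an assumed simplification of what the real update path does.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of one version bucket on a replica. Hypothetical; NOT Solr's classes. */
class ToyVersionBucket {
    long highest;                                     // 0 means "unknown / not seeded"
    final Map<String, Long> index = new HashMap<>();  // doc id -> version, stands in for the index
    int lookups = 0;                                  // counts per-document version lookups

    ToyVersionBucket(long seed) {
        this.highest = seed;
    }

    /** Applies an add forwarded from the leader; returns false if the add is dropped. */
    boolean add(String id, long version) {
        if (highest != 0 && highest < version) {
            // Newer than anything this bucket has seen: skip the per-document lookup.
            highest = version;
        } else {
            // Possibly repeated or reordered: check the stored version for this id.
            lookups++;
            Long last = index.get(id);
            if (last != null && last >= version) {
                return false; // repeat or reorder, so drop it
            }
        }
        index.put(id, version);
        return true;
    }
}
```

In this sketch, a bucket seeded too high (say 1000) still accepts a new doc with version 10; it just pays for one extra lookup.  A bucket seeded too low (say 5) that already holds doc "a" at version 20 will skip the lookup for a stale add of "a" at version 10 and wrongly accept it, but the set of doc ids in the map stays the same, which is why an adds-only run should not lose documents this way.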

edit: additionally, it can't be SOLR-7347 since that requires updates to the same document(s)


> Seed version buckets with max version from index
> ------------------------------------------------
>
>                 Key: SOLR-7332
>                 URL: https://issues.apache.org/jira/browse/SOLR-7332
>             Project: Solr
>          Issue Type: Sub-task
>          Components: SolrCloud
>            Reporter: Timothy Potter
>            Assignee: Timothy Potter
>         Attachments: SOLR-7332.patch, SOLR-7332.patch, SOLR-7332.patch, SOLR-7332.patch, SOLR-7332.patch
>
>
> See the full discussion between Yonik and me in SOLR-6816.
> The TL;DR of that discussion is that we should initialize highest for each version bucket to the MAX value of the {{_version_}} field in the index as early as possible, such as after the first soft or hard commit. This will ensure that bulk adds of docs that don't yet exist avoid an unnecessary lookup for a non-existent document in the index.


