Posted to dev@lucene.apache.org by "Yonik Seeley (JIRA)" <ji...@apache.org> on 2015/04/01 22:46:53 UTC

[jira] [Commented] (SOLR-7332) Seed version buckets with max version from index

    [ https://issues.apache.org/jira/browse/SOLR-7332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14391417#comment-14391417 ] 

Yonik Seeley commented on SOLR-7332:
------------------------------------

Wow, nice results!

We still have some correctness issues though (but fixing them should not impact performance).
VersionBucket.highest must be greater than or equal to the highest version encountered for that bucket.
Having a bucket value too high will only cause a few unnecessary lookups, but having a value too low will cause correctness issues.
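
To illustrate why the two failure modes are asymmetric, here is a minimal sketch of the kind of check a bucket's "highest" value enables. It is not Solr's actual update code: the class BucketCheckSketch, its in-memory map standing in for the index, and the apply method are all hypothetical, and the rule "skip the lookup when the incoming version exceeds highest" is an assumption about how the bound is used.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not Solr's VersionBucket or update processor.
class BucketCheckSketch {
    long highest;                                    // claimed upper bound for this bucket
    final Map<String, Long> index = new HashMap<>(); // stand-in for the index: id -> version
    int lookups = 0;                                 // counts per-document version lookups

    // Returns true if the update was applied, false if it was dropped as stale.
    boolean apply(String id, long version) {
        if (highest != 0 && version > highest) {
            // Assumed fast path: the update is newer than anything the bucket
            // claims to have seen, so no per-document lookup is done.
            highest = version;
            index.put(id, version);
            return true;
        }
        // Slow path: check the version actually stored for this id.
        lookups++;
        Long last = index.get(id);
        if (last != null && last >= version) {
            return false; // repeat or reordered update, drop it
        }
        index.put(id, version);
        if (version > highest) {
            highest = version;
        }
        return true;
    }
}
```

With a document stored at version 100, a bucket seeded too low (say 50) lets a stale update at version 80 take the fast path and overwrite the newer document. A bucket seeded too high (say 200) only forces an extra lookup for an update at version 150.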

Primary issue: commits happen concurrently with other updates, so attempting to initialize bucket versions from an "active" index will mean that we will sometimes set a version too low. "As coded, this doesn't happen until the first soft- or hard- commit is triggered" sounds like the index will indeed be active.

Secondary issue: there is a race in setting the bucket version.  VersionBucket.updateHighest will be called with the monitor held, but between the test and the set, VersionInfo.seedBucketVersionHighestFromIndex can change "highest".  Or vice-versa.
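
One way to close a test-then-set race like this is to make "raise highest to at least X" a single atomic step shared by both callers. The sketch below is an assumption about one possible fix, not the patch's code: MonotonicBucket and raiseTo are hypothetical names, and having both the update path and the seeding path go through the same method is my suggestion rather than something the patch does.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a version bucket; not Solr's VersionBucket.
class MonotonicBucket {
    private final AtomicLong highest = new AtomicLong(0L);

    // Raises "highest" to at least candidate and never lowers it.
    // The compare-and-set loop makes the test and the set one atomic step,
    // so a concurrent caller cannot slip in between them.
    long raiseTo(long candidate) {
        while (true) {
            long current = highest.get();
            if (candidate <= current) {
                return current;   // already high enough, nothing to do
            }
            if (highest.compareAndSet(current, candidate)) {
                return candidate; // we won the race
            }
            // Another thread changed the value; re-read and retry.
        }
    }

    long get() {
        return highest.get();
    }
}
```

If both the per-update path and the seed-from-index path called raiseTo, whichever ran last could only leave the value unchanged or higher. Holding the same bucket monitor in both paths would be an alternative with the same effect.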

> Seed version buckets with max version from index
> ------------------------------------------------
>
>                 Key: SOLR-7332
>                 URL: https://issues.apache.org/jira/browse/SOLR-7332
>             Project: Solr
>          Issue Type: Sub-task
>          Components: SolrCloud
>            Reporter: Timothy Potter
>            Assignee: Timothy Potter
>         Attachments: SOLR-7332.patch
>
>
> See the full discussion between Yonik and me in SOLR-6816.
> The TL;DR of that discussion is that we should initialize highest for each version bucket to the MAX value of the {{__version__}} field in the index as early as possible, such as after the first soft- or hard- commit. This will ensure that bulk adds where the docs don't exist avoid an unnecessary lookup for a non-existent document in the index.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
