Posted to dev@lucene.apache.org by "Adrien Grand (JIRA)" <ji...@apache.org> on 2017/05/18 12:24:04 UTC

[jira] [Updated] (LUCENE-7837) Use indexCreatedVersionMajor to fail opening too old indices

     [ https://issues.apache.org/jira/browse/LUCENE-7837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrien Grand updated LUCENE-7837:
---------------------------------
    Attachment: LUCENE-7837.patch

The attached patch shows what it would look like. We should do this as of 8.0, which will be the first major version whose previous major version also records the index creation version.

> Use indexCreatedVersionMajor to fail opening too old indices
> ------------------------------------------------------------
>
>                 Key: LUCENE-7837
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7837
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-7837.patch
>
>
> Even though in theory we only support reading indices created with version N or N-1, in practice it is possible to run a forceMerge so that Lucene agrees to open the index, since we only record the version that wrote segments and commit points. However, as of Lucene 7.0, we also record the major version that was used to initially create the index, meaning we could also fail to open N-2 indices that have only been merged with version N-1.
> The current state of things, where we could read old data without knowing it, raises issues with everything that is performed on top of the codec API, such as analysis, input validation or norms encoding, especially now that we plan to change the defaults (LUCENE-7730).
> For instance, we are only starting to reject broken offsets in term vectors in Lucene 7. If we do not require the index to have been created with either Lucene 7 or 8 once we move to Lucene 8, then codecs could still be fed broken offsets, which is a pity since assuming that offsets go forward makes things easier to encode and also potentially allows for better compression.
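
To make the proposed rule concrete, here is a minimal sketch of the check described above. It is not the attached patch and does not use Lucene's classes: the class name IndexCreatedVersionCheck, both method names, and the use of IllegalStateException are assumptions made for illustration, with plain ints standing in for the recorded creation major version and the running major version.

```java
// Hypothetical sketch, not Lucene's API and not the attached patch.
final class IndexCreatedVersionCheck {

    private IndexCreatedVersionCheck() {}

    /**
     * Returns true if an index first created with major version
     * createdMajor may be opened by major version currentMajor,
     * following the "N or N-1" rule from the issue description.
     */
    static boolean isSupported(int createdMajor, int currentMajor) {
        return createdMajor <= currentMajor && createdMajor >= currentMajor - 1;
    }

    /**
     * Throws if the index is too old or too new. The version that last
     * wrote the segments is deliberately not consulted, so a forceMerge
     * with version N-1 cannot make an N-2 index acceptable.
     */
    static void ensureSupported(int createdMajor, int currentMajor) {
        if (createdMajor > currentMajor) {
            throw new IllegalStateException(
                "Index created with major version " + createdMajor
                    + " is newer than the running version " + currentMajor);
        }
        if (createdMajor < currentMajor - 1) {
            throw new IllegalStateException(
                "Index created with major version " + createdMajor
                    + " is too old for version " + currentMajor
                    + "; the oldest supported is " + (currentMajor - 1));
        }
    }
}
```

Under these assumptions, ensureSupported(7, 8) returns normally, while ensureSupported(6, 8) throws even if every segment of that index was rewritten by a 7.x forceMerge, which is the case the issue wants to close.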



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
