Posted to issues@lucene.apache.org by "Christine Poerschke (Jira)" <ji...@apache.org> on 2019/10/01 13:21:00 UTC

[jira] [Commented] (LUCENE-8961) CheckIndex: pre-exorcise document id salvage

    [ https://issues.apache.org/jira/browse/LUCENE-8961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941823#comment-16941823 ] 

Christine Poerschke commented on LUCENE-8961:
---------------------------------------------

{quote}... When I said "on top of CheckIndex", I was rather thinking of running CheckIndex programmatically and then looking at the return value to understand what segments might need salvaging. ...
{quote}
Ah, thanks for clarifying that!

Okay, let me take this opportunity to jot down some code pointers for whenever this is picked up again in the future:
 * A [CheckIndex|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.2.0/lucene/core/src/java/org/apache/lucene/index/CheckIndex.java#L397-L400] object can be created programmatically and the [checkIndex|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.2.0/lucene/core/src/java/org/apache/lucene/index/CheckIndex.java#L488-L499] method can then be called on it.
 * The method return value is a [Status|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.2.0/lucene/core/src/java/org/apache/lucene/index/CheckIndex.java#L93-L99] object which currently includes (amongst other things) the number of bad segments.
 * If {{checkIndex}} also returned, for example, which segments are bad, then that information could be fed into a stand-alone document id salvage tool.
 * One advantage of making the tool stand-alone is that it can be very clear that caveats apply, i.e. it might not always be possible to salvage values correctly. It can also be very clear that the tool is read-only (whereas {{CheckIndex}} can be read-only as well as read-write).
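
As a rough illustration of the first two pointers, here is a minimal sketch of running {{CheckIndex}} programmatically and reading the returned {{Status}}. It assumes lucene-core 8.x on the classpath. The class name {{CheckIndexReport}} and its helper methods are hypothetical names made up for this example. Note that {{Status.segmentInfos}} lists the segments that were inspected, not specifically the bad ones, so identifying bad segments would still need the extra information discussed above.

```java
import java.io.IOException;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.index.CheckIndex;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

// Hypothetical helper class, not part of Lucene.
public class CheckIndexReport {

  /** Runs CheckIndex without exorcising anything and returns its Status. */
  static CheckIndex.Status check(Directory dir) throws IOException {
    // The constructor obtains the index write lock; close() releases it again.
    try (CheckIndex checker = new CheckIndex(dir)) {
      return checker.checkIndex();
    }
  }

  /** Collects the names of all segments that CheckIndex looked at. */
  static List<String> inspectedSegmentNames(CheckIndex.Status status) {
    List<String> names = new ArrayList<>();
    for (CheckIndex.Status.SegmentInfoStatus seg : status.segmentInfos) {
      names.add(seg.name);
    }
    return names;
  }

  public static void main(String[] args) throws IOException {
    // args[0] is assumed to be the path of the index directory.
    try (Directory dir = FSDirectory.open(Paths.get(args[0]))) {
      CheckIndex.Status status = check(dir);
      System.out.println("clean: " + status.clean);
      System.out.println("bad segments: " + status.numBadSegments);
      System.out.println("segments inspected: " + inspectedSegmentNames(status));
    }
  }
}
```

A stand-alone salvage tool could start from a report like this and would only need read access to the segment files, provided the bad segments can be identified.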

> CheckIndex: pre-exorcise document id salvage
> --------------------------------------------
>
>                 Key: LUCENE-8961
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8961
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Christine Poerschke
>            Priority: Minor
>         Attachments: LUCENE-8961.patch, LUCENE-8961.patch
>
>
> The [CheckIndex|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.2.0/lucene/core/src/java/org/apache/lucene/index/CheckIndex.java] tool supports the exorcising of corrupt segments from an index.
> This ticket proposes to add an extra option which could first be used to potentially salvage the document ids of the segment(s) about to be exorcised. Re-ingestion for those documents could then be arranged so as to repair the data damage caused by the exorcising.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)