Posted to dev@lucene.apache.org by "Mark Miller (JIRA)" <ji...@apache.org> on 2016/12/07 20:01:59 UTC

[jira] [Commented] (SOLR-9836) Add more graceful recovery steps when failing to create SolrCore

    [ https://issues.apache.org/jira/browse/SOLR-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15729763#comment-15729763 ] 

Mark Miller commented on SOLR-9836:
-----------------------------------

We probably want to be able to turn it off. Some users may want the ability to run CheckIndex and try to salvage what they can in corruption cases.

I'm not sure that is the right exception to catch; it's very brittle. We should probably be looking mostly for CorruptIndexException, and if that doesn't cover a case at the Lucene level, look at improving it there. Even if a 0-byte segments file with nothing to roll back to throws an EOFException today, it may not tomorrow. I think that is the point of CorruptIndexException: you can have more than momentary confidence that your code is not treating exceptions one way while things change underneath you over time.
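To make the two points above concrete, here is a rough sketch, not the attached patch. It keys the recovery decision on the corruption exception type rather than on EOFException, and gates it behind a switch. So that it compiles without Lucene on the classpath, it declares a local stand-in for org.apache.lucene.index.CorruptIndexException (which, like the stand-in, extends IOException). IndexOpener, Outcome, tryOpen and the autoRecoveryEnabled flag are hypothetical names used only for illustration.

```java
import java.io.EOFException;
import java.io.IOException;

public class CorruptionCheckSketch {

    // Local stand-in for org.apache.lucene.index.CorruptIndexException, declared
    // here only so the sketch is self-contained. Real code would import Lucene's class.
    static class CorruptIndexException extends IOException {
        CorruptIndexException(String message) { super(message); }
    }

    // Hypothetical abstraction of "open the index for this core".
    interface IndexOpener {
        void open() throws IOException;
    }

    enum Outcome { OPENED, RECOVER_FROM_PEER, FAIL }

    // Hypothetical decision helper: only a corruption exception triggers automatic
    // recovery, and only when the (assumed) switch is on. Any other IOException,
    // including EOFException, is treated as an ordinary failure.
    static Outcome tryOpen(IndexOpener opener, boolean autoRecoveryEnabled) {
        try {
            opener.open();
            return Outcome.OPENED;
        } catch (CorruptIndexException e) {
            return autoRecoveryEnabled ? Outcome.RECOVER_FROM_PEER : Outcome.FAIL;
        } catch (IOException e) {
            return Outcome.FAIL;
        }
    }

    public static void main(String[] args) {
        IndexOpener corrupt = () -> { throw new CorruptIndexException("bad segments file"); };
        IndexOpener truncated = () -> { throw new EOFException("read past EOF"); };

        System.out.println(tryOpen(corrupt, true));    // corruption, switch on
        System.out.println(tryOpen(corrupt, false));   // corruption, switch off
        System.out.println(tryOpen(truncated, true));  // EOFException is not special-cased
    }
}
```

With the switch off, the core simply fails to load as it does today, which leaves the index untouched for anyone who wants to run CheckIndex on it by hand.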

bq. Would it be safe to add directoryFactory.doneWithDirectory() to modifyIndexProps?

directoryFactory.doneWithDirectory is for the case where you are done with the directory and it can now be deleted if need be - you won't access it again.

bq. Should modifyIndexProps stay in IndexFetcher or move somewhere more generic?

I have not looked yet, but it may make more sense in SolrCore or somewhere similar.

> Add more graceful recovery steps when failing to create SolrCore
> ----------------------------------------------------------------
>
>                 Key: SOLR-9836
>                 URL: https://issues.apache.org/jira/browse/SOLR-9836
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrCloud
>            Reporter: Mike Drob
>         Attachments: SOLR-9836.patch
>
>
> I have seen several cases where there is a zero-length segments_n file. We haven't identified the root cause of these issues (possibly a poorly timed crash during replication?) but if there is another node available then Solr should be able to recover from this situation. Currently, we log and give up on loading that core, leaving the user to manually intervene.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
