Posted to dev@lucene.apache.org by "Shawn Heisey (JIRA)" <ji...@apache.org> on 2015/03/23 13:41:11 UTC

[jira] [Commented] (SOLR-7292) OutOfMemory happened in Solr, but /clusterstates.json shows cores "active"

    [ https://issues.apache.org/jira/browse/SOLR-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375854#comment-14375854 ] 

Shawn Heisey commented on SOLR-7292:
------------------------------------

I hope that we can improve Solr to deal with the problem you found, but there's something important to say:  The results of OOM errors are, by their very nature, extremely hard to predict.

It *is* possible to force a program into predictable behavior on OOM, but it is a difficult task to accomplish.  Such programming *has* been done in parts of the Lucene codebase, specifically those parts that write the index files, so that OOM will not produce a corrupt index.  The "no corruption" guarantee is the only one that Lucene attempts to make ... in all other ways, the behavior of a Lucene program (like Solr) is not predictable in the face of OOM.

Your best bet is to ensure that Java runs an OOM script that will kill Solr entirely when OOM happens.  This is handled automatically by the Solr 5.0 start script.
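For anyone not yet on the 5.0 start script, here is a rough sketch of how such a setup could look. It is not the actual Solr start script. It relies on the HotSpot option -XX:OnOutOfMemoryError, which runs a command when the JVM first throws an OutOfMemoryError; HotSpot expands %p to the JVM's own process id. The class name OomKillLauncher, the 512m heap size, and the start.jar path are assumptions chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: assembles a JVM command line that kills the
// process outright on the first OutOfMemoryError.
class OomKillLauncher {
    // HotSpot option: run this command when OutOfMemoryError is first thrown.
    // %p is replaced by the JVM with its own process id.
    static final String OOM_FLAG = "-XX:OnOutOfMemoryError=kill -9 %p";

    static List<String> buildCommand(String javaBin, String maxHeap, String jar) {
        List<String> cmd = new ArrayList<>();
        cmd.add(javaBin);
        cmd.add("-Xmx" + maxHeap);   // heap size is an assumed example value
        cmd.add(OOM_FLAG);           // one argv element, so no shell quoting needed
        cmd.add("-jar");
        cmd.add(jar);                // assumed path to the jar being launched
        return cmd;
    }

    public static void main(String[] args) {
        // Only prints the command; it does not start a process.
        List<String> cmd = buildCommand("java", "512m", "start.jar");
        System.out.println(String.join(" ", cmd));
    }
}
```

Passing the returned list to new ProcessBuilder(cmd).inheritIO().start() would launch the process. When typing the same option in a shell, the value needs quoting because it contains spaces. A hard kill is used on purpose here: after an OOM the JVM's state cannot be trusted, so a graceful shutdown may not complete.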


> OutOfMemory happened in Solr, but /clusterstates.json shows cores "active"
> --------------------------------------------------------------------------
>
>                 Key: SOLR-7292
>                 URL: https://issues.apache.org/jira/browse/SOLR-7292
>             Project: Solr
>          Issue Type: Bug
>          Components: contrib - Clustering
>    Affects Versions: 4.7
>         Environment: Redhat Linux 6.3 64bit
>            Reporter: Forest Soup
>              Labels: performance
>         Attachments: OOM.txt, failure.txt
>
>
> One of our 5 Solr servers hit an OOM, but /clusterstates.json in ZK still shows it as "active".  The OOM exceptions are in the attached OOM.txt.
> However, updates and commits to the collection that has cores on that Solr server fail. The logs are in failure.txt.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
