Posted to dev@lucene.apache.org by Guy Moshkowich <GU...@il.ibm.com> on 2015/06/09 18:23:32 UTC

CheckIndex failed for Solr 4.7.2 index

We are using Solr 4.7.2. When we run CheckIndex.checkIndex on one of the
Solr shards, we get the error below.
Both replicas of the shard had the same error.
The shard index looked healthy:
1) It appeared active in the Solr admin page.
2) We could run searches against it.
3) No relevant errors were found in the Solr logs.
4) After we optimized the index in LUKE, CheckIndex did not report any 
error.

My questions:
1) Is this a real issue, or a known bug in the CheckIndex code that causes
a false positive?
2) Is there a known fix for this issue?

Here is the error we got:
 validateIndex Segments file=segments_bhe numSegments=15 version=4.7 format= userData={commitTimeMSec=1432689607801}
  1 of 15: name=_6cth docCount=248744
    codec=Lucene46
    compound=false
    numFiles=11
    size (MB)=86.542
    diagnostics = {timestamp=1428883354605, os=Linux, os.version=2.6.32-431.23.3.el6.x86_64, mergeFactor=10, source=merge, lucene.version=4.7.2 1586229 - rmuir - 2014-04-10 09:00:35, os.arch=amd64, mergeMaxNumSegments=-1, java.version=1.7.0, java.vendor=IBM Corporation}
    has deletions [delGen=3174]
    test: open reader.........FAILED
    WARNING: fixIndex() would remove reference to this segment; full exception:
java.lang.RuntimeException: liveDocs count mismatch: info=156867, vs bits=156872
    at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:581)
    at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:372)

Appreciate your help,
Guy.
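
For readers unfamiliar with the message in the report above: "liveDocs count mismatch" appears to mean that two ways of counting the segment's live (non-deleted) documents disagree. Assuming "info" is the segment's docCount minus its recorded deletion count, the report implies 248744 - 156867 = 91877 recorded deletions, while the live-docs bitset has 156872 bits set, 5 more than expected. The following is a minimal sketch of that kind of consistency check. It is not Lucene's actual code: the class LiveDocsCheck, its method names, and the use of java.util.BitSet are all illustrative assumptions.

```java
import java.util.BitSet;

// Simplified stand-in for the consistency check; NOT Lucene's actual code.
// All names here are hypothetical and exist only for illustration.
final class LiveDocsCheck {

    // Live-document count implied by segment metadata:
    // total documents minus recorded deletions.
    static int expectedLive(int docCount, int delCount) {
        return docCount - delCount;
    }

    // Live-document count obtained by counting set bits
    // (a set bit is assumed to mean "document is live").
    static int countLive(BitSet liveDocs, int docCount) {
        int live = 0;
        for (int i = 0; i < docCount; i++) {
            if (liveDocs.get(i)) {
                live++;
            }
        }
        return live;
    }

    // Throws if the two counts disagree, using the message format from the report.
    static void check(int docCount, int delCount, BitSet liveDocs) {
        int info = expectedLive(docCount, delCount);
        int bits = countLive(liveDocs, docCount);
        if (info != bits) {
            throw new RuntimeException(
                "liveDocs count mismatch: info=" + info + ", vs bits=" + bits);
        }
    }

    public static void main(String[] args) {
        // Hypothetical 10-document segment: the metadata records 4 deletions,
        // but only 3 bits are cleared in the bitset.
        BitSet liveDocs = new BitSet(10);
        liveDocs.set(0, 10);
        liveDocs.clear(2);
        liveDocs.clear(5);
        liveDocs.clear(7);
        try {
            check(10, 4, liveDocs);
            System.out.println("OK");
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With the toy values in main, the metadata implies 6 live documents while the bitset contains 7, so the check fails in the same way as the report, only with smaller numbers.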

Re: CheckIndex failed for Solr 4.7.2 index

Posted by Michael McCandless <lu...@mikemccandless.com>.
IBM's J9 JVM unfortunately still has a number of nasty bugs affecting
Lucene; most likely you are hitting one of these.  We used to test J9
in our continuous Jenkins jobs, but there were just too many
J9-specific failures and we couldn't get IBM's attention to resolve
them, so we stopped.  For now you should switch to Oracle JDK, or
OpenJDK.

But there's some good news!  Recently, a member of the IBM JDK team
replied to this Elasticsearch thread:
https://discuss.elastic.co/t/need-help-with-ibm-jdk-issues-with-es-1-4-5/1748/3

And then Robert Muir ran Lucene's tests with the latest J9 and opened
several issues; see the 2nd bullet under Apache Lucene at
https://www.elastic.co/blog/this-week-in-elasticsearch-and-apache-lucene-2015-06-09
and at least one of the issues seems to be making progress
(https://issues.apache.org/jira/browse/LUCENE-6522).

So there is hope for the future, but for today it's too dangerous to
use J9 with Lucene/Solr/Elasticsearch.
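
Given the advice above, it can help to confirm which JVM a process is actually running on. The diagnostics in the original report show java.vendor=IBM Corporation, which is a standard Java system property. Below is a minimal sketch that reads it at startup. The class JvmVendorCheck and its helper method are hypothetical, and the substring matching (including the check of java.vm.name for "j9") is a heuristic assumption, not an official detection API.

```java
import java.util.Locale;

// Hypothetical startup check; the matching rules are a heuristic assumption.
final class JvmVendorCheck {

    // Returns true when the vendor or VM name suggests an IBM J9 runtime.
    static boolean looksLikeIbmJ9(String vendor, String vmName) {
        String v = vendor == null ? "" : vendor.toLowerCase(Locale.ROOT);
        String n = vmName == null ? "" : vmName.toLowerCase(Locale.ROOT);
        return v.contains("ibm") || n.contains("j9");
    }

    public static void main(String[] args) {
        // Standard system properties exposed by every JVM.
        String vendor = System.getProperty("java.vendor");
        String vmName = System.getProperty("java.vm.name");
        System.out.println("java.vendor=" + vendor + ", java.vm.name=" + vmName);
        if (looksLikeIbmJ9(vendor, vmName)) {
            System.out.println("WARNING: this looks like an IBM J9 JVM.");
        }
    }
}
```

Running this with the same java command that launches Solr shows whether the JVM switch actually took effect.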

Mike McCandless

http://blog.mikemccandless.com


