Posted to dev@lucene.apache.org by "Paul Smith (JIRA)" <ji...@apache.org> on 2008/05/12 00:41:55 UTC
[jira] Commented: (LUCENE-1282) Sun hotspot compiler bug in
1.6.0_04/05 affects Lucene
[ https://issues.apache.org/jira/browse/LUCENE-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12595946#action_12595946 ]
Paul Smith commented on LUCENE-1282:
------------------------------------
Another workaround might be to use '-client' instead of the default '-server' (for server class machines). This affects a few things, not least this switch:
-XX:CompileThreshold=10000 Number of method invocations/branches before compiling [-client: 1,500]
-server implies a 10000 value. I have personally observed problems similar to the above with -server, and usually -client ends up 'solving' them.
I'm sure there was also a way to mark a method so that it is not JIT-compiled (rather than resorting to -Xint, which disables it for everything), but now I can't find what that syntax is at all.
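To tell which of these configurations a given process is actually running under, something along these lines could be dropped into the application. This is only a sketch: the class and method names are made up for illustration, the version match covers just the two builds named in this issue, and the assumption that java.vm.name mentions "Client" or "Server" holds for Sun HotSpot but may not hold for other VMs.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper, not part of Lucene.
class JvmCheck {
    // True only for the two JRE builds named in this issue (1.6.0_04 and 1.6.0_05).
    // Any "-suffix" on the version string (e.g. "-ea") is ignored.
    static boolean isAffectedVersion(String javaVersion) {
        int dash = javaVersion.indexOf('-');
        String base = dash >= 0 ? javaVersion.substring(0, dash) : javaVersion;
        return base.equals("1.6.0_04") || base.equals("1.6.0_05");
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        // On Sun HotSpot this string names the client or server VM (assumption for other VMs).
        String vmName = System.getProperty("java.vm.name");
        // Flags passed to the JVM itself, such as -client, -server, -Xint or -Xbatch.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();

        System.out.println("java.version = " + version);
        System.out.println("java.vm.name = " + vmName);
        System.out.println("JVM arguments = " + jvmArgs);
        if (isAffectedVersion(version)) {
            System.out.println("WARNING: this JRE build is one of those named in LUCENE-1282");
        }
    }
}
```

Logging this once at startup would at least make it clear, when a corrupt segment is reported, whether the process was on one of the suspect builds and which compiler mode it was using.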
> Sun hotspot compiler bug in 1.6.0_04/05 affects Lucene
> ------------------------------------------------------
>
> Key: LUCENE-1282
> URL: https://issues.apache.org/jira/browse/LUCENE-1282
> Project: Lucene - Java
> Issue Type: Bug
> Components: Index
> Affects Versions: 2.3, 2.3.1
> Reporter: Michael McCandless
> Assignee: Michael McCandless
> Priority: Minor
> Fix For: 2.4
>
>
> This is not a Lucene bug. It's an as-yet not fully characterized Sun
> JRE bug, as best I can tell. I'm opening this to gather all things we
> know, and to work around it in Lucene if possible, and maybe open an
> issue with Sun if we can reduce it to a compact test case.
> It's hit at least 3 users:
> http://mail-archives.apache.org/mod_mbox/lucene-java-user/200803.mbox/%3c8c4e68610803180438x39737565q9f97b4802ed774a5@mail.gmail.com%3e
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200804.mbox/%3c4807654E.7050900@virginia.edu%3e
> http://mail-archives.apache.org/mod_mbox/lucene-java-user/200805.mbox/%3c733777220805060156t7fdb8fectf0bc984fbfe48a22@mail.gmail.com%3e
> It affects Lucene on at least JRE 1.6.0_04 and 1.6.0_05, whereas
> 1.6.0_03 works OK and it's unknown whether 1.6.0_06 shows it.
> The bug affects bulk merging of stored fields. When it strikes, the
> segment produced by a merge is corrupt because its fdx file (stored
> fields index file) is missing one document. After iterating many
> times with the first user that hit this, adding diagnostics &
> assertions, it seems that a call to fieldsWriter.addDocument sometimes
> either fails to run entirely, or fails to invoke its call to
> indexStream.writeLong. It's as if when hotspot compiles a method,
> there's some sort of race condition in cutting over to the compiled
> code whereby a single method call fails to be invoked (speculation).
> Unfortunately, this corruption is silent when it occurs and only later
> detected when a merge tries to merge the bad segment, or an
> IndexReader tries to open it. Here's a typical merge exception:
> {code}
> Exception in thread "Thread-10"
> org.apache.lucene.index.MergePolicy$MergeException:
> org.apache.lucene.index.CorruptIndexException:
> doc counts differ for segment _3gh: fieldsReader shows 15999 but segmentInfo shows 16000
> at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:271)
> Caused by: org.apache.lucene.index.CorruptIndexException: doc counts differ for segment _3gh: fieldsReader shows 15999 but segmentInfo shows 16000
> at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:313)
> at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:262)
> at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:221)
> at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3099)
> at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:2834)
> at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:240)
> {code}
> and here's a typical exception hit when opening a searcher:
> {code}
> org.apache.lucene.index.CorruptIndexException: doc counts differ for segment _kk: fieldsReader shows 72670 but segmentInfo shows 72671
> at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:313)
> at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:262)
> at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:230)
> at org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:73)
> at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:636)
> at org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:63)
> at org.apache.lucene.index.IndexReader.open(IndexReader.java:209)
> at org.apache.lucene.index.IndexReader.open(IndexReader.java:173)
> at org.apache.lucene.search.IndexSearcher.<init>(IndexSearcher.java:48)
> {code}
> Sometimes, adding -Xbatch (disables background compilation) or -Xint
> (disables compilation) to the java command line works around the
> issue.
> Here are some of the OS's we've seen the failure on:
> {code}
> SuSE 10.0
> Linux phoebe 2.6.13-15-smp #1 SMP Tue Sep 13 14:56:15 UTC 2005 x86_64
> x86_64 x86_64 GNU/Linux
> SuSE 8.2
> Linux phobos 2.4.20-64GB-SMP #1 SMP Mon Mar 17 17:56:03 UTC 2003 i686
> unknown unknown GNU/Linux
> Red Hat Enterprise Linux Server release 5.1 (Tikanga)
> Linux lab8.betech.virginia.edu 2.6.18-53.1.14.el5 #1 SMP Tue Feb 19
> 07:18:21 EST 2008 i686 i686 i386 GNU/Linux
> {code}
> I've already added assertions to Lucene to detect when this bug
> strikes, but since assertions are not usually enabled, I plan to add a
> real check to catch when this bug strikes *before* we commit the merge
> to the index. This way we can detect & quarantine the failure and
> prevent corruption from entering the index.
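A check of the kind described above could look roughly like the following. This is a sketch and not the check that went into Lucene: it assumes the stored fields index file holds exactly one 8-byte entry per document with no header (inferred from the indexStream.writeLong call mentioned in the description), and the class and method names are hypothetical.

```java
import java.io.File;

// Hypothetical stand-alone check, not Lucene code.
class StoredFieldsIndexCheck {
    // Assumption: one long (8 bytes) per document and no file header.
    static final int BYTES_PER_ENTRY = 8;

    // True when the index file length corresponds to exactly expectedDocCount entries.
    static boolean docCountMatches(long fdxLengthBytes, int expectedDocCount) {
        return fdxLengthBytes % BYTES_PER_ENTRY == 0
            && fdxLengthBytes / BYTES_PER_ENTRY == expectedDocCount;
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: StoredFieldsIndexCheck <fdx file> <expected doc count>");
            return;
        }
        File fdx = new File(args[0]);
        int expected = Integer.parseInt(args[1]);
        if (docCountMatches(fdx.length(), expected)) {
            System.out.println("ok: " + expected + " entries");
        } else {
            // Integer division, so a truncated trailing entry is not counted.
            System.out.println("doc counts differ: file holds "
                + (fdx.length() / BYTES_PER_ENTRY) + " entries but expected " + expected);
        }
    }
}
```

With the numbers from the merge exception above, a file of 15999 * 8 bytes checked against an expected count of 16000 would be reported as a mismatch, which is the condition to catch before the merged segment is committed.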
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org
Re: [jira] Commented: (LUCENE-1282) Sun hotspot compiler bug in
1.6.0_04/05 affects Lucene
Posted by Mark Miller <ma...@gmail.com>.
From what I read, -Xint slows you down so much it's not much of a
workaround.
Here are a couple of examples of that exclude-method syntax (I had to use it
recently with Eclipse):
-XX:CompileCommand=exclude,org/apache/lucene/index/IndexReader\$1,doBody
-XX:CompileCommand=exclude,org/eclipse/core/internal/dtree/DataTreeNode,forwardDeltaWith
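Going by the two examples above, the option is the class name with slashes instead of dots, then a comma, then the method name. A small helper to build it from a Java class name might look like this; the class and method names here are made up for illustration, and the format is taken only from the examples above.

```java
// Hypothetical helper for building the exclude option shown above.
class CompileCommandOption {
    // binaryClassName is what Class.getName() returns, e.g. "pkg.Outer$1" for an
    // anonymous inner class; dots become slashes, '$' is kept as-is.
    static String excludeOption(String binaryClassName, String methodName) {
        return "-XX:CompileCommand=exclude,"
            + binaryClassName.replace('.', '/') + "," + methodName;
    }

    public static void main(String[] args) {
        System.out.println(
            excludeOption("org.apache.lucene.index.IndexReader$1", "doBody"));
        System.out.println(
            excludeOption("org.eclipse.core.internal.dtree.DataTreeNode", "forwardDeltaWith"));
    }
}
```

Note that the helper emits a bare $ for inner classes; when pasting the result into a shell command line it still needs escaping (the backslash in the first example above) or single quotes.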