Posted to dev@lucene.apache.org by "Michael McCandless (JIRA)" <ji...@apache.org> on 2009/07/06 21:19:14 UTC

[jira] Updated: (LUCENE-1609) Eliminate synchronization contention on initial index reading in TermInfosReader ensureIndexIsRead

     [ https://issues.apache.org/jira/browse/LUCENE-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-1609:
---------------------------------------

    Attachment: LUCENE-1609.patch

Attached patch.  This addresses this issue and LUCENE-1718.

I added 2 new static IndexReader.open expert methods that allow you to
pass in the TermInfos index divisor.  You can pass in -1 to disable
loading of the index entirely (eg, IndexWriter does this when merging
segments).  I also added the param to IndexWriter.getReader, so you
can get an NRT reader w/ subsampled index terms.
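
For anyone unfamiliar with what the divisor does, here is a rough,
self-contained sketch of the idea. It is not code from the patch:
SubsampledTermIndex and its methods are hypothetical names, and the
list of strings stands in for the on-disk index terms.  A divisor of
N keeps every Nth index term in RAM, and -1 loads nothing.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration; this is not Lucene's TermInfosReader. */
final class SubsampledTermIndex {
    private final int divisor;
    private final List<String> loadedTerms; // fixed at construction time

    SubsampledTermIndex(List<String> onDiskIndexTerms, int divisor) {
        if (divisor == 0 || divisor < -1) {
            throw new IllegalArgumentException(
                "divisor must be -1 or >= 1, got " + divisor);
        }
        this.divisor = divisor;
        List<String> kept = new ArrayList<>();
        if (divisor != -1) {
            // Keep every divisor-th index term; divisor 1 keeps them all.
            for (int i = 0; i < onDiskIndexTerms.size(); i += divisor) {
                kept.add(onDiskIndexTerms.get(i));
            }
        }
        this.loadedTerms = Collections.unmodifiableList(kept);
    }

    /** False when the index was deliberately skipped (divisor == -1). */
    boolean isIndexLoaded() {
        return divisor != -1;
    }

    List<String> loadedTerms() {
        return loadedTerms;
    }
}
```

With five on-disk index terms, a divisor of 2 keeps the 1st, 3rd and
5th, trading slower term lookups for less RAM.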

This replaces the set/getTermInfosIndexDivisor methods (they are now
deprecated), ie you now must specify the divisor when opening the
reader.  If these methods are called, an UnsupportedOperationException
is thrown.  This is technically a break in back compat (previously you
could call it before the terms index was used, eg if no searches had
been run) but I think we should make an exception here.  Very few
users make use of these expert methods, and having these users switch
to specifying the index divisor up front is a small code change in
exchange for removing all synchronization from the terms dict.

I also made all attrs in TermInfosReader final, and there is no longer
any synchronization.  To handle the case in IndexWriter, where a merge
first opens a segment (which does not need the index) and then an NRT
reader (or, applyDeletes) needs to share the same pooled reader and
needs the terms index, I added a PrivateTermsDict static class to
SegmentReader.  This class just wraps a no-index-loaded
TermInfosReader, which merging will use, and then can open a new
index-is-loaded TermInfosReader when/if needed.
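
To make the shape of that wrapper concrete, here is a minimal sketch
of the idea, not the code in the patch.  ImmutableTermsReader and
TermsDictHolder are hypothetical stand-ins for TermInfosReader and
PrivateTermsDict, and keeping the (rarely hit) locking in the wrapper
rather than in the reader is an assumption of this sketch.

```java
/** Hypothetical stand-in for a reader whose fields are all final. */
final class ImmutableTermsReader {
    private final String[] indexTerms; // empty when the index was skipped
    private final boolean hasIndex;

    ImmutableTermsReader(String[] onDiskIndexTerms, int indexDivisor) {
        this.hasIndex = indexDivisor != -1;
        // Divisor handling is reduced to "load all or nothing" for brevity.
        this.indexTerms = hasIndex ? onDiskIndexTerms.clone() : new String[0];
    }

    boolean hasIndex() { return hasIndex; }
    int indexSize() { return indexTerms.length; }
}

/** Hypothetical wrapper: merging gets the no-index reader, others upgrade. */
final class TermsDictHolder {
    private final String[] onDiskIndexTerms;
    private final ImmutableTermsReader noIndex;
    private ImmutableTermsReader withIndex; // guarded by "this"

    TermsDictHolder(String[] onDiskIndexTerms) {
        this.onDiskIndexTerms = onDiskIndexTerms.clone();
        this.noIndex = new ImmutableTermsReader(this.onDiskIndexTerms, -1);
    }

    /** Merging never needs the terms index. */
    ImmutableTermsReader forMerging() {
        return noIndex;
    }

    /** Opens the index-loaded reader once, on first demand. */
    synchronized ImmutableTermsReader forSearching(int indexDivisor) {
        if (withIndex == null) {
            withIndex = new ImmutableTermsReader(onDiskIndexTerms, indexDivisor);
        }
        return withIndex;
    }
}
```

The point of the split is that each reader instance is immutable once
built, so lookups on it need no locking; only the one-time upgrade
from "no index" to "index loaded" is coordinated.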


> Eliminate synchronization contention on initial index reading in TermInfosReader ensureIndexIsRead 
> ---------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1609
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1609
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>    Affects Versions: 2.9
>         Environment: Solr 
> Tomcat 5.5
> Ubuntu 2.6.20-17-generic
> Intel(R) Pentium(R) 4 CPU 2.80GHz, 2Gb RAM
>            Reporter: Dan Rosher
>            Assignee: Michael McCandless
>             Fix For: 2.9
>
>         Attachments: LUCENE-1609.patch, LUCENE-1609.patch, LUCENE-1609.patch
>
>
> The synchronized method ensureIndexIsRead in TermInfosReader causes contention under heavy load.
> Simple to reproduce: e.g. under Solr, with all caches turned off, run a simple range search such as id:[0 TO 999999] on even a small index (in my case 28K docs) under a load/stress test application. Examining the thread dump afterwards (kill -3) shows many threads blocked on 'waiting for monitor entry' for this method.
> Rather than using double-checked locking, which is known to have issues, this implementation uses a state pattern, where only one thread can move the object from the IndexNotRead state to the IndexRead state, and in doing so alters the object's behavior, i.e. once the index is loaded, the index no longer needs a synchronized method.
> In my particular test, this increased throughput at least 30 times.
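
The state pattern described in the quoted report can be sketched
roughly as below.  This is an illustration only, not the reporter's
attached patch: LazyTermIndex, the hard-coded terms and the load
counter are hypothetical, and making the state field volatile is an
assumption of this sketch so that the state switch is safely visible
to other threads.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of the IndexNotRead -> IndexRead state switch. */
final class LazyTermIndex {
    private interface State {
        String[] index();
    }

    private final AtomicInteger loads = new AtomicInteger();
    // Assumption: volatile, so the swapped-in state is published safely.
    private volatile State state = new IndexNotRead();

    /** After the first load this delegates to an unsynchronized state. */
    String[] index() {
        return state.index();
    }

    int loadCount() {
        return loads.get();
    }

    private final class IndexNotRead implements State {
        @Override
        public synchronized String[] index() {
            // Only the first thread through still sees this state installed.
            if (state == this) {
                loads.incrementAndGet();
                state = new IndexRead(new String[] {"apple", "mango"});
            }
            return state.index();
        }
    }

    private static final class IndexRead implements State {
        private final String[] terms;

        IndexRead(String[] terms) {
            this.terms = terms;
        }

        @Override
        public String[] index() {
            return terms; // no locking once the index is loaded
        }
    }
}
```

Threads that arrive while the load is in progress block on the
IndexNotRead object; every call made after the switch goes straight
to IndexRead without taking a monitor.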
