Posted to dev@lucene.apache.org by xu cheng <xc...@gmail.com> on 2010/12/29 06:49:00 UTC

any issues about the *perthread classes

hi all
I noticed that there are plenty of *PerThread classes in the trunk
http://svn.apache.org/repos/asf/lucene/dev/trunk/
while in the realtime_search branch
http://svn.apache.org/repos/asf/lucene/dev/branches/realtime_search/
the *PerThread classes are gone!
This confused me, since I'm new here.

What's the purpose of such a design? What's the advantage? Are there any
issues that refer to this?

Any suggestions or references are appreciated!
regards.
xu

Re: any issues about the *perthread classes

Posted by Michael McCandless <lu...@mikemccandless.com>.
Basically, we are moving the thread state "upwards" in Lucene's indexing chain.

I.e., very early on when indexing a doc, you pick a thread-private state.
Then the thread does all of its indexing into this private state,
unfettered by any sync blocks.

This is akin to moving to a process-based concurrency model, i.e., we
are more strongly separating threads to limit the number of locks that
must be acquired when indexing a doc, or when flushing.
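
As a rough illustration of the idea (a sketch only, not code from Lucene;
ThreadState, StatePool, and run are hypothetical names), a thread could
check out a private state at the start of each doc and then index into it
without any further locking. The pool checkout is the only point where
threads synchronize:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch: these names are illustrative, not Lucene classes.
final class PrivateStateSketch {

    /** Indexing state used by one thread at a time, so it needs no locks. */
    static final class ThreadState {
        final List<String> bufferedDocs = new ArrayList<>();

        void index(String doc) {
            bufferedDocs.add(doc); // deliberately unsynchronized
        }
    }

    /** The only shared structure: a pool that hands out exclusive states. */
    static final class StatePool {
        final List<ThreadState> all = new ArrayList<>();
        private final BlockingQueue<ThreadState> free;

        StatePool(int size) {
            free = new ArrayBlockingQueue<>(size);
            for (int i = 0; i < size; i++) {
                ThreadState state = new ThreadState();
                all.add(state);
                free.add(state);
            }
        }

        ThreadState acquire() throws InterruptedException {
            return free.take(); // the one brief synchronization point per doc
        }

        void release(ThreadState state) {
            free.add(state);
        }
    }

    /** Indexes docsPerThread docs on each thread; returns the total buffered. */
    static int run(int numThreads, int docsPerThread) {
        StatePool pool = new StatePool(numThreads);
        List<Thread> threads = new ArrayList<>();
        for (int t = 0; t < numThreads; t++) {
            final int id = t;
            Thread thread = new Thread(() -> {
                for (int d = 0; d < docsPerThread; d++) {
                    try {
                        ThreadState state = pool.acquire(); // pick the state up front
                        try {
                            state.index("doc-" + id + "-" + d);
                        } finally {
                            pool.release(state);
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            });
            threads.add(thread);
            thread.start();
        }
        try {
            for (Thread thread : threads) {
                thread.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        int total = 0;
        for (ThreadState state : pool.all) {
            total += state.bufferedDocs.size();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(run(4, 1000)); // 4 threads x 1000 docs each
    }
}
```

Handing a state through the queue is what makes the unsynchronized
ArrayList safe here: only one thread holds a given state at any moment.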

This is an important change because it means flushing of a single
thread-private state can take place concurrently with ongoing indexing
into other thread states.  Lucene cannot do this today, since a flush
flushes all thread states, which results in a serious bottleneck on
indexing throughput for machines with a lot of available concurrency.  I
wrote about this problem here:

    http://chbits.blogspot.com/2010/09/lucenes-indexing-is-fast.html

The takeaway is that with 6 indexing threads we are blocked 50%
of the time waiting for flush, which is quite awful.  This was on a
machine with an SSD and 24 cores, so Lucene was nowhere near able to
take advantage of this machine's concurrency.  Once flushing is
concurrent, we should be able to fully saturate both IO and CPU
concurrency on such a machine...
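
To make the concurrent-flush idea concrete, here is a minimal sketch under
stated assumptions (it is not Lucene's implementation; ThreadState,
maxBufferedDocs, and the two counters are hypothetical stand-ins for real
segment writing). Each thread owns one state and flushes it when its own
buffer fills, so no thread ever waits on another thread's flush:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: these names are illustrative, not Lucene classes.
final class ConcurrentFlushSketch {

    /** Private buffer that flushes on its own, without stopping other threads. */
    static final class ThreadState {
        private final List<String> buffer = new ArrayList<>();
        private final int maxBufferedDocs;
        private final AtomicInteger flushedDocs; // stands in for docs written out
        private final AtomicInteger flushCount;  // stands in for segments written

        ThreadState(int maxBufferedDocs, AtomicInteger flushedDocs,
                    AtomicInteger flushCount) {
            this.maxBufferedDocs = maxBufferedDocs;
            this.flushedDocs = flushedDocs;
            this.flushCount = flushCount;
        }

        void index(String doc) {
            buffer.add(doc);
            if (buffer.size() >= maxBufferedDocs) {
                flush();
            }
        }

        void flush() {
            if (buffer.isEmpty()) {
                return;
            }
            // A real flush would write a segment here. Only this state is
            // touched, so other threads keep indexing in the meantime.
            flushedDocs.addAndGet(buffer.size());
            flushCount.incrementAndGet();
            buffer.clear();
        }
    }

    /** Returns {total docs flushed, number of flushes}. */
    static int[] run(int numThreads, int docsPerThread, int maxBufferedDocs) {
        AtomicInteger flushedDocs = new AtomicInteger();
        AtomicInteger flushCount = new AtomicInteger();
        List<Thread> threads = new ArrayList<>();
        for (int t = 0; t < numThreads; t++) {
            final int id = t;
            Thread thread = new Thread(() -> {
                ThreadState state =
                    new ThreadState(maxBufferedDocs, flushedDocs, flushCount);
                for (int d = 0; d < docsPerThread; d++) {
                    state.index("doc-" + id + "-" + d);
                }
                state.flush(); // flush whatever is left at the end
            });
            threads.add(thread);
            thread.start();
        }
        try {
            for (Thread thread : threads) {
                thread.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return new int[] {flushedDocs.get(), flushCount.get()};
    }

    public static void main(String[] args) {
        int[] r = run(4, 1000, 300);
        System.out.println(r[0] + " docs in " + r[1] + " flushes");
    }
}
```

With 4 threads, 1000 docs per thread, and a buffer limit of 300, each
thread flushes three full buffers plus one final partial buffer, with no
global stop-the-world flush.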

Mike

