Posted to java-user@lucene.apache.org by Yatir Ben Shlomo <ya...@outbrain.com> on 2010/07/22 16:54:13 UTC
Question to the writer of MultiPassIndexSplitter
Hi,
I heard that MultiPassIndexSplitter is being rewritten so that it runs in a single pass and works faster.
I was wondering whether this is already done and, if not, when it is due?
Thanks
RE: Question to the writer of MultiPassIndexSplitter
Posted by "Burton-West, Tom" <tb...@umich.edu>.
The work on MultiPassIndexSplitter is being done by Andrzej Bialecki, the creator of Luke.
See http://lucene-eurocon.org/sessions-track1-day1.html#3
http://lucene-eurocon.org/slides/Munching-&-crunching-Lucene-index-post-processing-and-applications_Andrzej-Bialecki.pdf
The slides say "SinglePassSplitter work started, to be contributed soon."
You might try asking him directly or posting to the java-dev list.
Tom
www.hathitrust.org/blogs
-----Original Message-----
From: Christopher Condit [mailto:condit@sdsc.edu]
Sent: Thursday, August 05, 2010 12:08 PM
To: Yatir Ben Shlomo
Cc: java-user@lucene.apache.org
Subject: RE: Question to the writer of MultiPassIndexSplitter
> > > I heard work is being done on re-writing MultiPassIndexSplitter so it
> > > will be a single pass and work quicker.
> > Because that was so slow I just wrote a utility class to create a list of N
> > IndexWriters and round robin documents to them as the index is created.
> > Then we use a ParallelMultiSearcher for retrieval. I can send you the code if
> > you're interested...
> Yes it will be great if you can send me this code..
Here's some code: http://pastie.org/1077591
We re-index everything offline from scratch. You'll need to modify the code to support reopening and updating documents if that's a requirement...
-Chris
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
RE: Question to the writer of MultiPassIndexSplitter
Posted by Christopher Condit <co...@sdsc.edu>.
> > > I heard work is being done on re-writing MultiPassIndexSplitter so it
> > > will be a single pass and work quicker.
> > Because that was so slow I just wrote a utility class to create a list of N
> > IndexWriters and round robin documents to them as the index is created.
> > Then we use a ParallelMultiSearcher for retrieval. I can send you the code if
> > you're interested...
> Yes it will be great if you can send me this code..
Here's some code: http://pastie.org/1077591
We re-index everything offline from scratch. You'll need to modify the code to support reopening and updating documents if that's a requirement...
-Chris
RE: Question to the writer of MultiPassIndexSplitter
Posted by Christopher Condit <co...@sdsc.edu>.
> I heard work is being done on re-writing MultiPassIndexSplitter so it will be a
> single pass and work quicker.
Because that was so slow, I just wrote a utility class that creates a list of N IndexWriters and distributes documents to them round-robin as the index is created. We then use a ParallelMultiSearcher for retrieval. I can send you the code if you're interested...
-Chris
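For readers without access to the pastie link, the round-robin idea described above can be sketched roughly as follows. This is not Chris's code: the class name RoundRobinRouter and its add method are hypothetical, and plain java.util.function.Consumer sinks stand in for the N IndexWriters so the sketch compiles without Lucene on the classpath. In a real setup each sink would presumably call IndexWriter.addDocument on its own writer and handle the checked IOException.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/**
 * Hypothetical helper: hands each incoming item to one of N sinks in
 * strict rotation. In a Lucene setup each sink would wrap one IndexWriter
 * (calling addDocument and handling its IOException); plain Consumers are
 * an assumption made here to keep the sketch dependency-free.
 */
final class RoundRobinRouter<T> {
    private final List<Consumer<T>> sinks;
    private int next = 0; // index of the sink that receives the next item

    RoundRobinRouter(List<Consumer<T>> sinks) {
        if (sinks.isEmpty()) {
            throw new IllegalArgumentException("need at least one sink");
        }
        // Defensive copy so later changes to the caller's list do not
        // alter the rotation.
        this.sinks = new ArrayList<>(sinks);
    }

    /** Routes one item to the current sink, then advances the rotation. */
    void add(T item) {
        sinks.get(next).accept(item);
        next = (next + 1) % sinks.size();
    }
}
```

With three sinks and seven documents, the first sink receives documents 0, 3 and 6, the second 1 and 4, and the third 2 and 5, so the resulting sub-indexes differ in size by at most one document. The sketch is not thread-safe; indexing from several threads would need synchronization around add.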