Posted to java-user@lucene.apache.org by Ndapa Nakashole <nd...@gmail.com> on 2007/07/04 14:31:16 UTC

Index partitioning by term

I am considering using Lucene in my mini Grid-based search engine. I would
like to partition my index by term as opposed to by document. From
what I have read on the mailing list so far, it seems that partitioning by
term is impossible with Lucene. Am I right to conclude this? I know Nutch
partitions by document, but in my environment of very limited bandwidth I
would like to avoid partitioning by document.

-- 
-------------------------------------------------------------------------
ndapandula nakashole
www.cs.uct.ac.za/~nnakasho
-------------------------------------------------------------------------

Re: Index partitioning by term

Posted by Mike Klaas <mi...@gmail.com>.
On 4-Jul-07, at 5:31 AM, Ndapa Nakashole wrote:

> I am considering using Lucene in my mini Grid-based search engine. I would
> like to partition my index by term as opposed to by document. From
> what I have read on the mailing list so far, it seems that partitioning by
> term is impossible with Lucene. Am I right to conclude this? I know Nutch
> partitions by document, but in my environment of very limited bandwidth I
> would like to avoid partitioning by document.

Partitioning indices by term is an approach whose adoption (as far as
I am aware) is limited to academic projects.  Further, it is much more
bandwidth-intensive than the document-partition approach (you have to
do term posting-list intersections across machines, instead of
locally).  With doc partitioning, you get the top X docs from each
server, using almost no bandwidth.

-Mike 
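
The contrast described above can be sketched in plain Java. This is only an
illustration under stated assumptions, not Lucene code: the class
PartitionSketch, the Hit type, and the methods mergeTopK and intersect are
hypothetical names invented for this example. With document partitioning,
each server returns only its local top hits and the caller merges them; with
term partitioning, the posting lists for different query terms sit on
different machines, so at least one whole list has to cross the network
before it can be intersected.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; none of these names come from Lucene.
final class PartitionSketch {

    /** A scored hit: a document id plus its relevance score. */
    static final class Hit {
        final int docId;
        final double score;

        Hit(int docId, double score) {
            this.docId = docId;
            this.score = score;
        }
    }

    /**
     * Document partitioning: every server sends only its local top hits,
     * so the data moved is proportional to k, not to the index size.
     */
    static List<Hit> mergeTopK(List<List<Hit>> perServerHits, int k) {
        List<Hit> all = new ArrayList<>();
        for (List<Hit> hits : perServerHits) {
            all.addAll(hits);
        }
        // Highest score first.
        all.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return new ArrayList<>(all.subList(0, Math.min(k, all.size())));
    }

    /**
     * Term partitioning: the two sorted posting lists live on different
     * servers, so one of them must be shipped in full before this
     * intersection can run.
     */
    static List<Integer> intersect(int[] postingsA, int[] postingsB) {
        List<Integer> common = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < postingsA.length && j < postingsB.length) {
            if (postingsA[i] == postingsB[j]) {
                common.add(postingsA[i]);
                i++;
                j++;
            } else if (postingsA[i] < postingsB[j]) {
                i++;
            } else {
                j++;
            }
        }
        return common;
    }

    public static void main(String[] args) {
        // Two servers, each returning its own local top hits.
        List<Hit> server1 = Arrays.asList(new Hit(1, 0.9), new Hit(4, 0.4));
        List<Hit> server2 = Arrays.asList(new Hit(7, 0.7), new Hit(9, 0.2));
        for (Hit h : mergeTopK(Arrays.asList(server1, server2), 2)) {
            System.out.println("doc " + h.docId + " score " + h.score);
        }

        // Posting lists for two terms held on two different servers.
        int[] termOnServerA = {1, 4, 7, 9, 12};
        int[] termOnServerB = {4, 9, 10};
        System.out.println("docs matching both terms: "
                + intersect(termOnServerA, termOnServerB));
    }
}
```

In the first case only a handful of (docId, score) pairs travel per server,
whatever the index size. In the second, the traffic grows with the length of
the posting list that has to be moved, which for a common term can be a large
fraction of the collection.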

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Index partitioning by term

Posted by Mathieu Lecarme <ma...@garambrogne.net>.
Ndapa Nakashole wrote:
> I am considering using Lucene in my mini Grid-based search engine. I would
> like to partition my index by term as opposed to by document. From
> what I have read on the mailing list so far, it seems that partitioning by
> term is impossible with Lucene. Am I right to conclude this? I know Nutch
> partitions by document, but in my environment of very limited bandwidth I
> would like to avoid partitioning by document.
>
Each partitioned index is a full index. If you try to partition by
term, you can end up referring to a Document in another index.
A cluster with low bandwidth seems strange. Are you using a cluster of
cell phones?!

M.
