Posted to java-user@lucene.apache.org by Ndapa Nakashole <nd...@gmail.com> on 2007/07/04 14:31:16 UTC
Index partitioning by term
I am considering using Lucene in my mini Grid-based search engine. I would
like to partition my index by term as opposed to by document. From
what I have read on the mailing list so far, it seems that partitioning by
term is impossible with Lucene. Am I right to conclude this? I know Nutch
partitions by document, but in my environment of very limited bandwidth I
would like to avoid partitioning by document.
--
-------------------------------------------------------------------------
ndapandula nakashole
www.cs.uct.ac.za/~nnakasho
-------------------------------------------------------------------------
Re: Index partitioning by term
Posted by Mike Klaas <mi...@gmail.com>.
On 4-Jul-07, at 5:31 AM, Ndapa Nakashole wrote:
> I am considering using Lucene in my mini Grid-based search engine. I would
> like to partition my index by term as opposed to by document. From
> what I have read on the mailing list so far, it seems that partitioning by
> term is impossible with Lucene. Am I right to conclude this? I know Nutch
> partitions by document, but in my environment of very limited bandwidth I
> would like to avoid partitioning by document.
Partitioning indices by term is an approach whose adoption (as far as
I am aware) is limited to academic projects. Further, it is much more
bandwidth-intensive than the document partition approach (you have to
do term-posting list intersections across machines, instead of
locally). With doc partitioning, you get the top X docs from each
server, using almost no bandwidth.
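
To make the bandwidth argument concrete, here is a minimal sketch in plain
Java. It is not Lucene or Nutch code: the class PartitionSketch, the Hit
type, and all method names are hypothetical, and the cost model is an
assumption (a naive protocol in which the node holding the longest posting
list does the intersection and every other list is shipped to it whole).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names are Lucene or Nutch APIs.
class PartitionSketch {

    // One scored hit, as a document-partitioned shard might report it.
    static final class Hit {
        final int shard;
        final int docId;
        final float score;

        Hit(int shard, int docId, float score) {
            this.shard = shard;
            this.docId = docId;
            this.score = score;
        }
    }

    // Document partitioning: each shard intersects posting lists locally and
    // returns only its own top hits; the coordinator merges them by score.
    static List<Hit> mergeTopK(List<List<Hit>> perShardHits, int k) {
        List<Hit> all = new ArrayList<>();
        for (List<Hit> hits : perShardHits) {
            all.addAll(hits);
        }
        all.sort((a, b) -> Float.compare(b.score, a.score)); // highest score first
        return new ArrayList<>(all.subList(0, Math.min(k, all.size())));
    }

    // Term partitioning: posting lists for different query terms live on
    // different nodes, so sorted doc-id lists must be brought together
    // before they can be intersected.
    static int[] intersect(int[] a, int[] b) {
        int[] out = new int[Math.min(a.length, b.length)];
        int i = 0, j = 0, n = 0;
        while (i < a.length && j < b.length) {
            if (a[i] == b[j]) {
                out[n++] = a[i];
                i++;
                j++;
            } else if (a[i] < b[j]) {
                i++;
            } else {
                j++;
            }
        }
        return Arrays.copyOf(out, n);
    }

    // Items crossing the network with document partitioning: k hits per shard.
    static long docPartitionItemsSent(int shards, int k) {
        return (long) shards * k;
    }

    // Items crossing the network with term partitioning, under the assumed
    // naive protocol: every posting list except the longest one is shipped.
    static long termPartitionItemsSent(int[]... postingLists) {
        long total = 0;
        long longest = 0;
        for (int[] list : postingLists) {
            total += list.length;
            longest = Math.max(longest, list.length);
        }
        return total - longest;
    }
}
```

Under these assumptions the document-partitioned cost depends only on the
number of shards and on X, while the term-partitioned cost grows with the
length of the posting lists, which for common terms grows with the size of
the collection.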
-Mike
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: Index partitioning by term
Posted by Mathieu Lecarme <ma...@garambrogne.net>.
Ndapa Nakashole wrote:
> I am considering using Lucene in my mini Grid-based search engine. I would
> like to partition my index by term as opposed to by document. From
> what I have read on the mailing list so far, it seems that partitioning by
> term is impossible with Lucene. Am I right to conclude this? I know Nutch
> partitions by document, but in my environment of very limited bandwidth I
> would like to avoid partitioning by document.
>
Each partitioned index is a full index. If you try to partition by
term, a term can refer to a Document in another index.
A cluster with low bandwidth seems strange. Are you using a cluster on
cell phones?!
M.