Posted to java-user@lucene.apache.org by Nikhil Chhaochharia <ni...@yahoo.com> on 2007/09/20 09:50:27 UTC
Multiple Indices vs Single Index
Hi,
I have about 40 indices which range in size from 10MB to 700MB. There are quite a few stored fields. To get an idea of the document size, I have about 400k documents in the 700MB index.
Depending on the query, I choose the index which needs to be searched. Each query hits only one index. I was wondering if creating a single index where every document will have the indexname as a field will be more efficient. I created such an index and it was 3.4 GB in size. My initial performance tests with it are not conclusive.
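For illustration, the two layouts described above can be sketched in plain Java. This is a toy model and not Lucene code: ordinary collections stand in for the indexes, and every name in it (IndexLayoutSketch, Doc, searchSeparate, searchCombined, the sample documents) is hypothetical. It only shows that routing a query to one of several indexes and restricting a query on a combined index by an indexname field should return the same hits; it says nothing about which is faster.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: plain collections stand in for Lucene indexes.
class IndexLayoutSketch {

    /** A document tagged with the name of the logical index it belongs to. */
    static final class Doc {
        final String indexName;
        final String text;

        Doc(String indexName, String text) {
            this.indexName = indexName;
            this.text = text;
        }
    }

    /** True if the whitespace-separated text contains the term (case-insensitive). */
    static boolean containsTerm(String text, String term) {
        return Arrays.asList(text.toLowerCase().split("\\s+")).contains(term.toLowerCase());
    }

    /** Layout A: one collection per index; the caller routes the query by name. */
    static List<String> searchSeparate(Map<String, List<Doc>> indices, String indexName, String term) {
        List<String> hits = new ArrayList<>();
        for (Doc d : indices.getOrDefault(indexName, new ArrayList<>())) {
            if (containsTerm(d.text, term)) {
                hits.add(d.text);
            }
        }
        return hits;
    }

    /** Layout B: one combined collection; the index name becomes a required field match. */
    static List<String> searchCombined(List<Doc> combined, String indexName, String term) {
        List<String> hits = new ArrayList<>();
        for (Doc d : combined) {
            if (d.indexName.equals(indexName) && containsTerm(d.text, term)) {
                hits.add(d.text);
            }
        }
        return hits;
    }

    /** Groups the combined documents back into one list per index name. */
    static Map<String, List<Doc>> splitByIndexName(List<Doc> docs) {
        Map<String, List<Doc>> byName = new HashMap<>();
        for (Doc d : docs) {
            byName.computeIfAbsent(d.indexName, k -> new ArrayList<>()).add(d);
        }
        return byName;
    }

    /** Made-up sample data for the sketch. */
    static List<Doc> sampleDocs() {
        List<Doc> docs = new ArrayList<>();
        docs.add(new Doc("books", "lucene in action"));
        docs.add(new Doc("books", "java concurrency"));
        docs.add(new Doc("blogs", "lucene tips"));
        return docs;
    }

    public static void main(String[] args) {
        List<Doc> combined = sampleDocs();
        Map<String, List<Doc>> separate = splitByIndexName(combined);
        System.out.println(searchSeparate(separate, "books", "lucene"));
        System.out.println(searchCombined(combined, "books", "lucene"));
    }
}
```

In the combined layout every query carries the extra indexname restriction, so the cost of that clause is one of the things a comparison would have to measure.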
Also, what other points should be considered when deciding between 1 index and 40 indices?
I have 8GB RAM on the machine.
Thanks,
Nikhil
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: Multiple Indices vs Single Index
Posted by Grant Ingersoll <gs...@apache.org>.
If I understand correctly, you want to do a two-stage retrieval,
right? That is, look up in the initial index (3.4 GB) and then do a
second search on the sub-index? Presumably, you have to manage the
Searchers, etc. for each of the sub-indexes as well as the big
index. This means you have to go through the hits from the first
search, then route, etc., correct?
Have you tried creating one single index with all the (stored)
fields, etc.? The worst-case scenario, assuming 1 GB per index, is that
you would have a 40 GB index, but my guess is that index compression
will reduce it further. Since you are well under that anyway, have you
tried just the straightforward solution? Or do you have other
requirements that force the sub-index solution? I am not sure it
will work, but it seems worth a try. Of course, this also depends on
how much you expect your indexes to grow.
Also, what was inconclusive about your tests? Maybe you can describe
in more detail what you have tried so far?
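One common reason timing tests come out inconclusive is that a few cold or noisy runs dominate the numbers. As a rough sketch only, a comparison could warm up each setup and then report the median of repeated runs. The class and method names below (SearchTimingSketch, medianNanos, median) are made up for this example, and the Runnable is a placeholder for whatever search call is being measured.

```java
import java.util.Arrays;

// Hypothetical timing helper; not part of Lucene.
class SearchTimingSketch {

    /** Runs the search a few times untimed, then returns the median of the timed runs in nanoseconds. */
    static long medianNanos(Runnable search, int warmupRuns, int timedRuns) {
        if (timedRuns < 1) {
            throw new IllegalArgumentException("timedRuns must be at least 1");
        }
        // Warm-up runs let caches and the JIT settle before measuring.
        for (int i = 0; i < warmupRuns; i++) {
            search.run();
        }
        long[] samples = new long[timedRuns];
        for (int i = 0; i < timedRuns; i++) {
            long start = System.nanoTime();
            search.run();
            samples[i] = System.nanoTime() - start;
        }
        return median(samples);
    }

    /** Median of the samples; for an even count, the mean of the two middle values. */
    static long median(long[] samples) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        if (sorted.length % 2 == 1) {
            return sorted[mid];
        }
        return (sorted[mid - 1] + sorted[mid]) / 2;
    }
}
```

Running the same set of queries through medianNanos against both the 40-index setup and the single-index setup would give two numbers that are at least comparable.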
Cheers,
Grant
--------------------------
Grant Ingersoll
http://lucene.grantingersoll.com
Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ