Posted to java-user@lucene.apache.org by sp...@gmx.eu on 2009/02/14 14:26:30 UTC

Multiple indexes vs single index

Hi,

We have an application that manages the data of multiple customers.
A customer can only search its own data, never the data of other customers.

So which is more efficient in terms of performance and resources:

One big single index filtered by an index field (customer-Id) or multiple
smaller indexes, one per customer?

I think there will be 10 million docs max. for all customers together.
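
To make the single-index option concrete, here is a minimal sketch of what "filtered by an index field" amounts to: every document carries its customer id, and a search keeps only hits that also appear in that customer's posting list. This is a toy in-memory model in plain Java, not Lucene's API; the class name SharedIndexSketch and its methods are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Toy in-memory inverted index; a conceptual model only, not Lucene's API. */
class SharedIndexSketch {
    // term -> sorted ids of documents whose text contains the term
    private final Map<String, TreeSet<Integer>> textPostings = new HashMap<>();
    // customerId -> sorted ids of documents owned by that customer
    private final Map<String, TreeSet<Integer>> customerPostings = new HashMap<>();
    private int nextDocId = 0;

    /** Indexes one document for the given customer and returns its id. */
    int add(String customerId, String text) {
        int docId = nextDocId++;
        customerPostings.computeIfAbsent(customerId, k -> new TreeSet<>()).add(docId);
        for (String term : text.toLowerCase().split("\\W+")) {
            if (!term.isEmpty()) {
                textPostings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
            }
        }
        return docId;
    }

    /** Returns ids of documents that contain the term AND belong to the customer. */
    List<Integer> search(String customerId, String term) {
        TreeSet<Integer> owned = customerPostings.getOrDefault(customerId, new TreeSet<>());
        TreeSet<Integer> matches = textPostings.getOrDefault(term.toLowerCase(), new TreeSet<>());
        List<Integer> hits = new ArrayList<>();
        for (Integer docId : matches) {
            if (owned.contains(docId)) {   // the customer filter
                hits.add(docId);
            }
        }
        return hits;
    }
}
```

The point of the sketch is that the customer restriction is applied on every search inside the index layer, so a caller cannot forget it. In the multiple-index option the same isolation comes from choosing which index to open.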

Thank you


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Multiple indexes vs single index

Posted by Shashi Kant <sh...@yahoo.com>.
Take a look at Solr - it should be able to handle the scale you describe. My suggestion is not to partition indexes unless you absolutely have to.




----- Original Message ----
From: "spring@gmx.eu" <sp...@gmx.eu>
To: java-user@lucene.apache.org
Sent: Saturday, February 14, 2009 10:27:58 AM
Subject: RE: Multiple indexes vs single index

Hi,

> You get one answer if each document is 1K, another if it's
> 1G. If you have 2 users or 10,000 users. If you require
> 100 queries/sec response time or 1 query can take 10
> seconds. If you require an update to the index every
> second or month...

Each doc has up to 10 A4 pages of text.
There will be about 100 customers/clients/companies (not users; every
customer will have about 10 users).
I would expect 1 query/s, not more.
No updates to the index.

> You have two problems with maintaining one index/user.
> 1> Trying to maintain N indexes is much harder than one,
>      especially when you factor in backups, etc.

This is the biggest problem I see.

> 2> There is a cost to opening an index. If you look at the
>      Wiki you'll see that the recommendation is that you
>      open an index, and run a few warmup queries to fill
>      caches etc. before, for instance, measuring performance.
>      So if you maintain an index/user, how do you expect
>      to handle this issue?

I would open the index on demand and close it after a period of inactivity.




Re: Multiple indexes vs single index

Posted by Chris Lu <ch...@gmail.com>.
A normal Lucene index should be able to handle it.

As long as there are no frequent inserts/updates, which can sometimes cause
hiccups for large indexes, one index is enough.

If your customer numbers keep growing, you will need one index for each
customer, which really isn't that difficult, especially since your QPS is not
so demanding.

-- 
Chris Lu
-------------------------
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: http://search.dbsight.com
Lucene Database Search in 3 minutes:
http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes
DBSight customer, a shopping comparison site, (anonymous per request) got
2.6 Million Euro funding!

On Sat, Feb 14, 2009 at 7:27 AM, <sp...@gmx.eu> wrote:

> Hi,
>
> > You get one answer if each document is 1K, another if it's
> > 1G. If you have 2 users or 10,000 users. If you require
> > 100 queries/sec response time or 1 query can take 10
> > seconds. If you require an update to the index every
> > second or month...
>
> Each doc has up to 10 A4 pages of text.
> There will be about 100 customers/clients/companies (not users; every
> customer will have about 10 users).
> I would expect 1 query/s, not more.
> No updates to the index.
>
> > You have two problems with maintaining one index/user.
> > 1> Trying to maintain N indexes is much harder than one,
> >      especially when you factor in backups, etc.
>
> This is the biggest problem I see.
>
> > 2> There is a cost to opening an index. If you look at the
> >      Wiki you'll see that the recommendation is that you
> >      open an index, and run a few warmup queries to fill
> >      caches etc. before, for instance, measuring performance.
> >      So if you maintain an index/user, how do you expect
> >      to handle this issue?
>
> I would open the index on demand and close it after a period of inactivity.

RE: Multiple indexes vs single index

Posted by sp...@gmx.eu.
Hi,

> You get one answer if each document is 1K, another if it's
> 1G. If you have 2 users or 10,000 users. If you require
> 100 queries/sec response time or 1 query can take 10
> seconds. If you require an update to the index every
> second or month...

Each doc has up to 10 A4 pages of text.
There will be about 100 customers/clients/companies (not users; every
customer will have about 10 users).
I would expect 1 query/s, not more.
No updates to the index.
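
As a rough back-of-envelope check, these figures can be combined with the 10 million document ceiling from the original question. The sketch below is only arithmetic; the 3,000 bytes of plain text per A4 page is an assumption that is not stated anywhere in this thread, and the class CorpusEstimate is hypothetical.

```java
/** Back-of-envelope corpus arithmetic; all inputs are supplied by the caller. */
class CorpusEstimate {
    /** Average number of documents per customer (integer division). */
    static long docsPerCustomer(long totalDocs, long customers) {
        return totalDocs / customers;
    }

    /** Upper bound on raw text size in bytes, before any indexing overhead or compression. */
    static long rawTextBytes(long totalDocs, long pagesPerDoc, long bytesPerPage) {
        return totalDocs * pagesPerDoc * bytesPerPage;
    }
}
```

With 10,000,000 documents and 100 customers this gives 100,000 documents per customer on average. Under the assumed 3,000 bytes per page, 10 pages per document puts the raw text at no more than 300,000,000,000 bytes (about 300 GB). The size of the resulting index depends on which fields are stored and how they are analyzed, so it has to be measured rather than derived from this number.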

> You have two problems with maintaining one index/user.
> 1> Trying to maintain N indexes is much harder than one,
>      especially when you factor in backups, etc.

This is the biggest problem I see.

> 2> There is a cost to opening an index. If you look at the
>      Wiki you'll see that the recommendation is that you
>      open an index, and run a few warmup queries to fill
>      caches etc. before, for instance, measuring performance.
>      So if you maintain an index/user, how do you expect
>      to handle this issue?

I would open the index on demand and close it after a period of inactivity.
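
The open-on-demand idea could be sketched roughly as follows. This is a plain-Java illustration, not Lucene code: the class IdleHandleCache, the opener and closer callbacks, and the injectable clock are all hypothetical names, and H stands for whatever handle type represents an open index.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Opens one handle per customer on demand and closes handles that have been idle too long. */
class IdleHandleCache<H> {
    private static final class Entry<T> {
        final T handle;
        long lastUsedMillis;
        Entry(T handle, long now) { this.handle = handle; this.lastUsedMillis = now; }
    }

    private final Map<String, Entry<H>> open = new HashMap<>();
    private final Function<String, H> opener;  // hypothetical: opens the index for a customer id
    private final Consumer<H> closer;          // hypothetical: releases an open handle
    private final long maxIdleMillis;
    private final LongSupplier clock;          // injectable so idle time can be simulated

    IdleHandleCache(Function<String, H> opener, Consumer<H> closer,
                    long maxIdleMillis, LongSupplier clock) {
        this.opener = opener;
        this.closer = closer;
        this.maxIdleMillis = maxIdleMillis;
        this.clock = clock;
    }

    /** Returns the customer's handle, opening it first if it is not already open. */
    synchronized H acquire(String customerId) {
        long now = clock.getAsLong();
        Entry<H> entry = open.get(customerId);
        if (entry == null) {
            entry = new Entry<>(opener.apply(customerId), now);
            open.put(customerId, entry);
        }
        entry.lastUsedMillis = now;
        return entry.handle;
    }

    /** Closes every handle idle for at least maxIdleMillis; returns how many were closed. */
    synchronized int evictIdle() {
        long now = clock.getAsLong();
        int closed = 0;
        Iterator<Entry<H>> it = open.values().iterator();
        while (it.hasNext()) {
            Entry<H> entry = it.next();
            if (now - entry.lastUsedMillis >= maxIdleMillis) {
                it.remove();
                closer.accept(entry.handle);
                closed++;
            }
        }
        return closed;
    }

    synchronized int openCount() { return open.size(); }
}
```

evictIdle() would typically be called from a periodic background task. Two things this sketch leaves out: a handle may be closed while a slow search is still using it, so real code would probably need reference counting, and the first query after a reopen pays the open and warm-up cost again.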




Re: Multiple indexes vs single index

Posted by Erick Erickson <er...@gmail.com>.
Define efficiency. Define document. Define user. Define....

This kind of question is unanswerable except in gross
generalities unless you take the time to provide details.

You get one answer if each document is 1K, another if it's
1G. If you have 2 users or 10,000 users. If you require
100 queries/sec response time or 1 query can take 10
seconds. If you require an update to the index every
second or month...


That said, I'd try one index first and run some performance
measurements. See the Wiki for performance tuning issues.

You have two problems with maintaining one index/user.
1> Trying to maintain N indexes is much harder than one,
     especially when you factor in backups, etc.
2> There is a cost to opening an index. If you look at the
     Wiki you'll see that the recommendation is that you
     open an index, and run a few warmup queries to fill
     caches etc. before, for instance, measuring performance.
     So if you maintain an index/user, how do you expect
     to handle this issue?
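
The warm-up advice in point 2 can be sketched as a small timing helper: run the query a few times untimed so caches fill, then time further runs. This is a minimal plain-Java illustration with hypothetical names (WarmupTimer, meanNanosAfterWarmup); the Runnable stands in for whatever search call is being measured.

```java
/** Times a query only after a few untimed warm-up runs. */
class WarmupTimer {
    /**
     * Runs the query warmupRuns times without timing it, then returns the
     * mean wall-clock nanoseconds per run over timedRuns further runs.
     */
    static long meanNanosAfterWarmup(Runnable query, int warmupRuns, int timedRuns) {
        if (timedRuns <= 0) {
            throw new IllegalArgumentException("timedRuns must be positive");
        }
        for (int i = 0; i < warmupRuns; i++) {
            query.run();   // untimed: lets caches fill
        }
        long start = System.nanoTime();
        for (int i = 0; i < timedRuns; i++) {
            query.run();
        }
        return (System.nanoTime() - start) / timedRuns;
    }
}
```

With one index per user, this warm-up cost is paid each time an index is opened, which is the issue the question above is asking about.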


Best
Erick



On Sat, Feb 14, 2009 at 8:26 AM, <sp...@gmx.eu> wrote:

> Hi,
>
> We have an application that manages the data of multiple customers.
> A customer can only search its own data, never the data of other customers.
>
> So which is more efficient in terms of performance and resources:
>
> One big single index filtered by an index field (customer-Id) or multiple
> smaller indexes, one per customer?
>
> I think there will be 10 million docs max. for all customers together.
>
> Thank you