Posted to solr-user@lucene.apache.org by Ma...@ibsbe.be on 2007/03/28 13:24:36 UTC
Item Search Database
hi,
i have a performance question...
we need to implement a feature called 'Item Search Database', which
basically means we have to limit the documents a user can search ...
example :
Item1 is in database1
item2 is in database2
item3 is in database1 and database2
and the client can only see the items in database1
we currently solve this by making a new Solr column for each
search database... so it looks like this :
ITEMNAME            DB1      DB2
-----------------   ------   ------
Item1               true     false
Item2               false    true
Item3               true     true
and we limit the result of a search by putting "db1:true" in the
querystring
but i have been reading about another method :
we could also use just one Solr column and put the names of the databases in
it...
like so :
ITEMNAME            DB
-----------------   -------
Item1               DB1
Item2               DB2
Item3               DB1 DB2
and limit the results by putting 'db:db1' in the querystring
and now for my question :
which of these options will be more performant ?
my guess is the first option will be the most performant since the indexes
will be better constructed
but i would really like a professional opinion on this ...
as i said, we are currently using the first option on 300.000 testrecords
and it is really performant.
some SearchDatabases have only 12 records in them and it takes less than 1ms
to get those 12 records back... so i'm guessing Solr is not searching the
full 300.000 records and i am kind of afraid that with the second option
Solr will have to search more records/indexes to get the same result...
well, hope you understand my question and thanks in advance !
- Maarten
PS: thank you to everybody on this list for the help and thank you to all
of the Solr/Lucene developers, great stuff !!
Re: Item Search Database
Posted by Yonik Seeley <yo...@apache.org>.
On 3/28/07, Maarten.De.Vilder@ibsbe.be <Ma...@ibsbe.be> wrote:
> we need to implement a feature called 'Item Search Database', which
> basically means we have to limit the documents a user can search ...
>
> example :
> Item1 is in database1
> item2 is in database2
> item3 is in database1 and database2
> and the client can only see the items in database1
>
> we currently solve this by making a new Solr column for each
> search database... so it looks like this :
> ITEMNAME            DB1      DB2
> -----------------   ------   ------
> Item1               true     false
> Item2               false    true
> Item3               true     true
>
> and we limit the result of a search by putting "db1:true" in the
> querystring
>
> but i have been reading about another method :
> we could also use just one Solr column and put the names of the databases in
> it...
> like so :
> ITEMNAME            DB
> -----------------   -------
> Item1               DB1
> Item2               DB2
> Item3               DB1 DB2
>
> and limit the results by putting 'db:db1' in the querystring
>
> and now for my question :
> which of these options will be more performant ?
They should both be roughly equal.
Lucene maintains an inverted index... a term points to all the
document id's containing that term... so it doesn't really matter if
the term is "db:db1" or "db1:true".
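To make the inverted-index point concrete, here is a toy sketch in Python. It is not how Lucene is implemented; build_inverted_index and the two sample document lists are made up for illustration, using the three items from the question. It only shows that both layouts end up with one term whose posting list holds the same document ids, so the lookup cost is about the same.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each (field, value) term to the list of doc ids containing it.

    Toy model only: a real Lucene index is far more elaborate.
    """
    index = defaultdict(list)
    for doc_id, fields in enumerate(docs):
        for field, values in fields.items():
            for value in values:
                index[(field, value)].append(doc_id)
    return index


# Layout 1: one boolean field per search database.
per_db_fields = [
    {"name": ["Item1"], "db1": ["true"], "db2": ["false"]},
    {"name": ["Item2"], "db1": ["false"], "db2": ["true"]},
    {"name": ["Item3"], "db1": ["true"], "db2": ["true"]},
]

# Layout 2: a single multi-valued field holding the database names.
single_field = [
    {"name": ["Item1"], "db": ["db1"]},
    {"name": ["Item2"], "db": ["db2"]},
    {"name": ["Item3"], "db": ["db1", "db2"]},
]

index_a = build_inverted_index(per_db_fields)
index_b = build_inverted_index(single_field)

# Each restriction is a single term lookup returning the matching doc ids.
postings_a = index_a[("db1", "true")]
postings_b = index_b[("db", "db1")]
print(postings_a, postings_b)
```

In both cases the restriction resolves to one term and one posting list, which is also why a database with only 12 items returns quickly: only those 12 postings are walked, not the full set of records.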
The 2nd way with a single field seems more extensible and future-proof though.
If you really want a speedup, pull out the restriction into a filter:
q=foo&fq=db:db1
The filter will be cached independently of the query, resulting in
much less work for every subsequent query that reuses that filter.
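A rough sketch of the idea, again in Python and again only a toy: ToySearcher, its filter_cache dict and the substring matching are assumptions for illustration, not Solr's actual filter cache. The point is that the document set for "db:db1" is computed once and then reused by later queries with the same fq. The last lines build the query string with the standard library; note that urlencode percent-encodes the colon.

```python
from urllib.parse import urlencode


class ToySearcher:
    """Toy model of a filter cached independently of the main query."""

    def __init__(self, docs):
        self.docs = docs
        self.filter_cache = {}          # fq string -> frozenset of doc ids
        self.filter_computations = 0    # how often a filter was really evaluated

    def _doc_set(self, fq):
        # Evaluate the filter only on a cache miss.
        if fq not in self.filter_cache:
            field, value = fq.split(":", 1)
            self.filter_computations += 1
            self.filter_cache[fq] = frozenset(
                i for i, doc in enumerate(self.docs)
                if value in doc.get(field, [])
            )
        return self.filter_cache[fq]

    def search(self, q, fq):
        # Restrict to the cached doc set, then apply a naive name match.
        allowed = self._doc_set(fq)
        return [i for i in sorted(allowed)
                if q.lower() in self.docs[i]["name"].lower()]


docs = [
    {"name": "Item1", "db": ["db1"]},
    {"name": "Item2", "db": ["db2"]},
    {"name": "Item3", "db": ["db1", "db2"]},
]

searcher = ToySearcher(docs)
first = searcher.search("item", "db:db1")
second = searcher.search("item3", "db:db1")  # reuses the cached filter

# Request parameters for the filtered query shown above.
query_string = urlencode({"q": "foo", "fq": "db:db1"})
print(first, second, searcher.filter_computations, query_string)
```

Two different queries ran here, but the filter was evaluated only once; that reuse is where the saving comes from when every request from a client carries the same database restriction.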
-Yonik
> my guess is the first option will be the most performant since the indexes
> will be better constructed
> but i would really like a professional opinion on this ...
>
> as i said, we are currently using the first option on 300.000 testrecords
> and it is really performant.
> some SearchDatabases have only 12 records in them and it takes less than 1ms
> to get those 12 records back... so i'm guessing Solr is not searching the
> full 300.000 records and i am kind of afraid that with the second option
> Solr will have to search more records/indexes to get the same result...
>
> well, hope you understand my question and thanks in advance !
> - Maarten
>
> PS: thank you to everybody on this list for the help and thank you to all
> of the Solr/Lucene developers, great stuff !!