Posted to general@lucene.apache.org by eoey <eo...@silverspringnet.com> on 2010/10/15 02:34:03 UTC

Optimizing multi fields searches

I have 10 million documents, each with 10 indexed fields.
One of the indexed fields holds an ID-type value which is almost unique. When I
search on that almost-unique field alone, results come back in 12 ms, but if I
add another criterion to the search it becomes very slow, around 5 seconds.

I don't use QueryParser; I build my own queries with BooleanQuery. Is there
any way to optimize this? How can I influence Lucene to search the index that
yields fewer documents first, followed by the other indexes (the same concept
used by some RDBMSs that collect index statistics/histograms)?

Thanks.

Re: Optimizing multi fields searches

Posted by Danil ŢORIN <to...@gmail.com>.
Try creating a Filter on the ID-like field and passing it to the search()
method. It should be much faster.
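
As a rough illustration of why a filter can help, here is a toy model in plain Java. It is not Lucene code: the class FilterSketch and its search() method are made up for this example, and it assumes the filter can be represented as a bit set of permitted document IDs. The point is only that the broader, more expensive clause gets evaluated for the handful of documents the filter allows rather than for all 10 million.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.function.IntPredicate;

// Toy model only: names are hypothetical and this is not the Lucene API.
final class FilterSketch {
    // Evaluates the expensive clause only for doc IDs whose bit is set in the filter.
    static List<Integer> search(BitSet allowedDocs, IntPredicate expensiveClause) {
        List<Integer> hits = new ArrayList<>();
        for (int doc = allowedDocs.nextSetBit(0); doc >= 0; doc = allowedDocs.nextSetBit(doc + 1)) {
            if (expensiveClause.test(doc)) {
                hits.add(doc);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Docs matching the almost-unique ID field (values chosen arbitrarily).
        BitSet idFilter = new BitSet();
        idFilter.set(42);
        idFilter.set(7_000_001);
        // Stand-in for the second, broader criterion.
        List<Integer> hits = search(idFilter, doc -> doc % 2 == 0);
        System.out.println(hits);
    }
}
```

In this sketch the second criterion is tested twice, once per document permitted by the filter, no matter how many documents it would match on its own.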


Re: Optimizing multi fields searches

Posted by Chris Hostetter <ho...@fucit.org>.
: I don't use QueryParser but I created my own queries with BooleanQuery of
: course. Is there any way to optimize this? How can I influence Lucene to
: search index that yields fewer documents first and followed by the other
: indexes (the same concept used by some RDBMS that collects indexes
: statistics/histogram)?

BooleanQuery should already do this ... if you look into the code, the 
"skipTo" methods in the ConjunctionScorer API are where the logic takes 
place -- each BooleanClause is asked to find the "first" document that it 
matches, and then the others are told to "skipTo" the first match they have 
that is that doc or later -- so a clause that matches only one doc will 
result in the other clauses skipping straight to that doc.
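
The skipping idea described above can be sketched in plain Java. This is a simplified stand-in, not Lucene's ConjunctionScorer: the class LeapfrogSketch and its methods are hypothetical, and it assumes each clause's matches are available as a sorted array of doc IDs.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: a simplified illustration of skipping, not Lucene's implementation.
final class LeapfrogSketch {
    // Returns the first index >= from whose value is >= target (binary search),
    // or docs.length if there is none. Mimics "skip to the first match that is
    // target or later".
    static int skipTo(int[] docs, int from, int target) {
        int lo = from;
        int hi = docs.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (docs[mid] < target) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // Intersects two sorted doc-ID lists; whichever side is behind skips
    // forward to the other side's current candidate.
    static List<Integer> intersect(int[] a, int[] b) {
        List<Integer> hits = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < a.length && j < b.length) {
            if (a[i] == b[j]) {
                hits.add(a[i]);
                i++;
                j++;
            } else if (a[i] < b[j]) {
                i = skipTo(a, i, b[j]);
            } else {
                j = skipTo(b, j, a[i]);
            }
        }
        return hits;
    }
}
```

When one list holds a single doc ID, the other list is skipped directly to that ID and the loop ends after a couple of steps, which is why a conjunction with an almost-unique clause would be expected to stay fast.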

Since you haven't posted any code, it's hard to guess what might be going 
wrong for you -- but off the top of my head I can't help but wonder if you 
remembered to construct your BooleanClauses with BooleanClause.Occur.MUST 
(if you use BooleanClause.Occur.SHOULD then docs which match *either* clause 
will match ... and it's totally understandable that the query might be much 
slower)
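
To make the difference concrete, here is a small plain-Java sketch. It is not Lucene code: the class OccurSketch is hypothetical, and sets of doc IDs stand in for the documents each clause matches. Requiring both clauses corresponds to an intersection, while making both optional corresponds to a union.

```java
import java.util.Set;
import java.util.TreeSet;

// Toy model only: sets of doc IDs stand in for the docs each clause matches.
final class OccurSketch {
    // Both clauses required (what MUST on each clause expresses): intersection.
    static Set<Integer> allRequired(Set<Integer> a, Set<Integer> b) {
        Set<Integer> out = new TreeSet<>(a);
        out.retainAll(b);
        return out;
    }

    // Both clauses optional (what SHOULD on each clause expresses): union.
    static Set<Integer> anyOptional(Set<Integer> a, Set<Integer> b) {
        Set<Integer> out = new TreeSet<>(a);
        out.addAll(b);
        return out;
    }
}
```

With one clause matching a single doc and the other matching millions, the intersection has at most one member, whereas the union contains every doc from the broad clause, all of which would have to be visited and scored.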


-Hoss