Posted to java-user@lucene.apache.org by Niels Ott <no...@sfs.uni-tuebingen.de> on 2009/03/14 13:38:16 UTC

Speeding up RangeQueries?

Hi all,

I'm working on my prototype system, and it turns out that RangeQueries
are quite slow. In a first test I have about 80,000 documents in my
index, and I combine two range queries with a normal text query using a
BooleanQuery.

In the long run I will need to enhance my index at indexing time so that
the range queries can be replaced by simple keywords.

For now, I'm looking for a way to speed up range queries. Does the
performance of a range query depend on the length of the contents of the
field in question?

Best,

    Niels

-- 
Niels Ott
Computational Linguist (B.A.)
http://www.drni.de/niels/

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Speeding up RangeQueries?

Posted by Yonik Seeley <yo...@lucidimagination.com>.
On Sat, Mar 14, 2009 at 11:37 AM, Niels Ott <no...@sfs.uni-tuebingen.de> wrote:
> As far as I understand this is only available from the unreleased
> development version, right? How safe is this version for use?
>
> Is it possible to use only the org.apache.lucene.search.trie package from
> there together with the old and stable Lucene?

It's unreleased, so the API could end up changing a little, but it's
very well tested already and should be independent of the rest of
Lucene (so yes, you should be able to just grab the trie package and
use it with the latest official Lucene release).


-Yonik
http://www.lucidimagination.com



Re: Speeding up RangeQueries?

Posted by Niels Ott <no...@sfs.uni-tuebingen.de>.
Hi Paul,

Paul Elschot wrote:
> Performance normally mostly depends on the number of terms indexed within
> the queried range. To limit the number of terms used during a range search,
> have a look here for more info on the new TrieRangeQuery:
> http://wiki.apache.org/lucene-java/SearchNumericalFields

This looks very promising.

As far as I understand this is only available from the unreleased 
development version, right? How safe is this version for use?

Is it possible to use only the org.apache.lucene.search.trie package 
from there together with the old and stable Lucene?

Best,

    Niels

-- 
Niels Ott
Computational Linguist (B.A.)
http://www.drni.de/niels/



Re: Speeding up RangeQueries?

Posted by Paul Elschot <pa...@xs4all.nl>.
On Saturday 14 March 2009 13:38:16 Niels Ott wrote:
> Hi all,
> 
> I'm working on my prototype system, and it turns out that RangeQueries
> are quite slow. In a first test I have about 80,000 documents in my
> index, and I combine two range queries with a normal text query using a
> BooleanQuery.
>
> In the long run I will need to enhance my index at indexing time so that
> the range queries can be replaced by simple keywords.

Perhaps that is avoidable, see the reference below.

> For now, I'm looking for a way to speed up range queries. Does the
> performance of a range query depend on the length of the contents of the
> field in question?

Performance mostly depends on the number of terms indexed within the
queried range. To limit the number of terms visited during a range search,
see this wiki page for more information on the new TrieRangeQuery:
http://wiki.apache.org/lucene-java/SearchNumericalFields
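
To make the idea of limiting the number of terms concrete, here is a
minimal sketch of the general trie technique in plain Java. It does not use
the Lucene API and is not the actual TrieRangeQuery code; the class name,
the method names, the "shift:prefix" term format, the 4-bit precision step
and the 32-bit value size are all assumptions chosen for illustration. Each
value is indexed under several terms of decreasing precision, and a range is
then covered by a small number of those prefix terms instead of one term per
distinct value.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: NOT the Lucene trie package. All names are hypothetical.
class TrieRangeSketch {
    static final int PRECISION_STEP = 4; // assumed: bits dropped per level
    static final int VALUE_BITS = 32;    // assumed: values lie in [0, 2^32)

    // Terms under which a single value would be indexed, one per precision
    // level. A term "shift:prefix" stands for all values v with
    // (v >>> shift) == prefix.
    static List<String> indexTerms(long value) {
        List<String> terms = new ArrayList<>();
        for (int shift = 0; shift < VALUE_BITS; shift += PRECISION_STEP) {
            terms.add(shift + ":" + (value >>> shift));
        }
        return terms;
    }

    // Cover the inclusive range [lo, hi] with as few prefix terms as this
    // greedy scheme finds. Requires 0 <= lo <= hi < 2^32.
    static List<String> coverRange(long lo, long hi) {
        List<String> terms = new ArrayList<>();
        long pos = lo;
        while (pos <= hi) {
            int shift = 0;
            // Grow the block while it stays aligned and inside the range.
            while (shift + PRECISION_STEP < VALUE_BITS) {
                int next = shift + PRECISION_STEP;
                long size = 1L << next;
                if (pos % size != 0 || pos + size - 1 > hi) {
                    break;
                }
                shift = next;
            }
            terms.add(shift + ":" + (pos >>> shift));
            pos += 1L << shift; // continue right after the covered block
        }
        return terms;
    }

    public static void main(String[] args) {
        List<String> cover = coverRange(1000, 9999);
        System.out.println(cover.size() + " terms: " + cover);
    }
}
```

With these assumed settings, the range 1000 to 9999 spans 9,000 distinct
values but is covered by 30 prefix terms. The price is a larger index, since
every value is stored under eight terms rather than one.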

Regards,
Paul Elschot

Re: Speeding up RangeQueries?

Posted by Yonik Seeley <yo...@lucidimagination.com>.
On Sat, Mar 14, 2009 at 8:38 AM, Niels Ott <no...@sfs.uni-tuebingen.de> wrote:
> For now, I'm looking for a way to speed up range queries. Does the
> performance of a range query depend on the length of the contents of the
> field in question?

Usually the biggest factor is the number of terms in the range.  The
second biggest is the number of documents each of those terms points to
(i.e. the number of documents containing that term).
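
The two cost factors above can be made visible with a toy inverted index
in plain Java. This is only a sketch under stated assumptions: it is not
Lucene code, and the class, field and method names are hypothetical. It
mimics roughly what a term-enumerating range search has to do, which is to
walk every indexed term between the bounds and read the document list of
each one, and it counts both kinds of work.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.TreeMap;

// Illustration only: a toy index, NOT Lucene. All names are hypothetical.
class RangeCostSketch {
    // Sorted term dictionary: term -> ids of the documents containing it.
    final TreeMap<String, List<Integer>> postings = new TreeMap<>();
    int termsVisited = 0; // factor 1: terms inside the range
    int postingsRead = 0; // factor 2: documents those terms point to

    void add(int docId, String term) {
        postings.computeIfAbsent(term, t -> new ArrayList<>()).add(docId);
    }

    // Collect all documents with lower <= term <= upper (both inclusive),
    // counting the work done along the way. Requires lower <= upper.
    BitSet rangeSearch(String lower, String upper) {
        BitSet hits = new BitSet();
        for (List<Integer> docs
                : postings.subMap(lower, true, upper, true).values()) {
            termsVisited++;
            for (int doc : docs) {
                postingsRead++;
                hits.set(doc);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        RangeCostSketch index = new RangeCostSketch();
        // 1000 documents, each with its own zero-padded value, so that
        // string order matches numeric order.
        for (int doc = 0; doc < 1000; doc++) {
            index.add(doc, String.format("%04d", doc));
        }
        BitSet hits = index.rangeSearch("0100", "0899");
        System.out.println(hits.cardinality() + " hits, "
                + index.termsVisited + " terms visited, "
                + index.postingsRead + " postings read");
    }
}
```

Neither counter depends on how long the field contents are, only on how
many distinct terms fall inside the bounds and how many documents they
reference. Indexing coarser values (for example dates rounded to the day)
shrinks the first counter without changing the second.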

For single-valued numeric or date fields, TrieRangeQuery in Lucene
trunk will speed up range queries.

-Yonik
http://www.lucidimagination.com
