Posted to java-user@lucene.apache.org by Kushal Dave <ks...@gmail.com> on 2012/03/05 22:46:51 UTC

Filter the search based on the subset of docids.

I have an ID field that contains about 100,000 unique ids. If I want to
query all records with ids [1-100], how should I do this?

I tried doing it the following way:
------------------------------------------------------------
Query qry = new MultiFieldQueryParser( fields, analyzer ).parse( query );
RangeFilter rf3 = new RangeFilter( "id", "1", "100", true, true );
FilteredQuery fq3 = new FilteredQuery( qry, rf3 );

// Search and get number of hits
TopDocCollector collector = new TopDocCollector( maxHits );
indexSearcher.search( fq3, collector );
-----------------------------------------------------------
Because the id field is indexed as a string and compared lexicographically,
the range 1 to 100 does not include what I expect. It only returns docs whose
ids fall in the lexicographic range (1, 10, 100) instead of the numeric range
(1, 2, 3, ... 99, 100).


Also, at the time of indexing the id field is stored but not analyzed:
luceneDoc.add( new Field( "id", id, Field.Store.YES, Field.Index.NOT_ANALYZED ) );

I am using the Lucene 2.4.1 API.

I apologize if this is a trivial question that has been answered previously
(I did try searching for it before posting here).
Any help is greatly appreciated.

Thanks,
Kushal

Re: Filter the search based on the subset of docids.

Posted by Ian Lea <ia...@gmail.com>.
You'll need to pad your ids to make this work.

000001
000002
etc.

with a length to match the max you require, now or in the future.
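
To illustrate the padding suggestion, here is a minimal sketch in plain Java
(no Lucene calls) that compares ids as strings, the same way a term range
does. The class name PaddedIds, the helpers pad and inRange, and the width of
6 are assumptions made up for this example; they are not part of Lucene.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration only; not a Lucene class.
final class PaddedIds {
    // Assumed fixed width; choose one that covers the largest id you expect.
    static final int WIDTH = 6;

    // Left-pads a non-negative id with zeros to WIDTH digits, e.g. 42 -> "000042".
    static String pad(int id) {
        if (id < 0) {
            throw new IllegalArgumentException("id must be non-negative: " + id);
        }
        String padded = String.format(Locale.ROOT, "%0" + WIDTH + "d", id);
        if (padded.length() > WIDTH) {
            throw new IllegalArgumentException("id does not fit in " + WIDTH + " digits: " + id);
        }
        return padded;
    }

    // Inclusive range check using plain String ordering (lexicographic).
    static boolean inRange(String term, String lower, String upper) {
        return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        List<Integer> unpadded = new ArrayList<>();
        List<Integer> padded = new ArrayList<>();
        for (int id = 1; id <= 1000; id++) {
            // Unpadded: "2" sorts after "100", so it falls outside ["1", "100"].
            if (inRange(Integer.toString(id), "1", "100")) {
                unpadded.add(id);
            }
            // Padded: string order now agrees with numeric order.
            if (inRange(pad(id), pad(1), pad(100))) {
                padded.add(id);
            }
        }
        System.out.println("unpadded matches: " + unpadded);
        System.out.println("padded match count: " + padded.size());
    }
}
```

Note that the ids would have to be re-indexed in padded form, and the filter
bounds padded the same way, e.g. new RangeFilter("id", "000001", "000100",
true, true).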

Or, better, upgrade to a recent release and use NumericField.


--
Ian.


