Posted to java-user@lucene.apache.org by Trond Aksel Myklebust <tr...@stud.ntnu.no> on 2005/10/07 23:41:15 UTC

Filtering of query based on field ID

As a new user of Lucene I wonder if anyone can suggest how to do this query
the fastest way:

 

In the index I have a "content" field with the data, and a "pathID" field
containing an ID ranging from 0 up to possibly millions.

In some cases I want to return every document containing (for example) the
term "xml"; in other cases only the ones with specific pathIDs.

 

My question is: What is the fastest way to retrieve documents containing (as
an example) the term "xml" with pathID between 0 and 100000 (with holes in
the range)?

As far as I know, I could either:

1. run a query: pathID:(1 2 3 5 6 ..) AND content:xml
2. run a range query: pathID:({0 TO 4} {4 TO 7}) AND content:xml
3. run the query content:xml and, in a loop, filter out unwanted
   documents by checking the pathID
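For illustration, option 3 could look roughly like the sketch below. It is plain Java with no Lucene calls: the int array stands in for the pathID values that would be read from each document matching content:xml, and the class and method names (PathIdPostFilter, keepAllowed) are made up for this example. A BitSet is only one possible way to hold the allowed IDs.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical sketch of option 3: post-filter hits by pathID.
class PathIdPostFilter {

    // Returns the pathIDs from hitPathIds that are set in 'allowed',
    // preserving the order in which the hits were returned.
    static List<Integer> keepAllowed(int[] hitPathIds, BitSet allowed) {
        List<Integer> kept = new ArrayList<>();
        for (int id : hitPathIds) {
            // BitSet.get throws on negative indexes, so guard against them.
            if (id >= 0 && allowed.get(id)) {
                kept.add(id);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Allowed IDs 1 2 3 5 6, i.e. a range with a hole at 4.
        BitSet allowed = new BitSet();
        allowed.set(1, 4); // sets bits 1, 2, 3 (end index is exclusive)
        allowed.set(5, 7); // sets bits 5, 6

        // Stand-in for the pathIDs of documents that matched content:xml.
        int[] hitPathIds = {0, 2, 4, 5, 9};

        System.out.println(keepAllowed(hitPathIds, allowed)); // prints [2, 5]
    }
}
```

The cost of this approach grows with the number of content:xml hits, since every hit is inspected regardless of how few pathIDs are wanted.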

 

Is there any other way, and what would be the best approach?

 

Regards

Trond Myklebust

University of Trondheim, Norway

Re: Filtering of query based on field ID

Posted by Chris Hostetter <ho...@fucit.org>.
: As a new user of Lucene I wonder if anyone can suggest how to do this query
: the fastest way:

Use a RangeFilter instead of a RangeQuery ... the performance will be much better.

Keep in mind that either way you do it, you'll need to store your pathID
values with some sort of padding so that they sort lexicographically;
otherwise, when you ask for "1 to 1000" you won't get any numbers that
don't start with the digit "1".
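To make the padding point concrete, here is a minimal sketch in plain Java. It assumes a fixed width of 7 digits (enough for IDs below ten million); the width and the names PathIdPadding, pad and inRange are illustrative choices, not anything from Lucene. The inRange helper just mimics an inclusive range over terms compared as strings.

```java
import java.util.Locale;

// Hypothetical sketch: zero-pad pathIDs so string order matches numeric order.
class PathIdPadding {

    // Assumed width; must be large enough for the biggest ID ever indexed.
    static final int WIDTH = 7;

    // Left-pads a non-negative ID with zeros to WIDTH digits.
    static String pad(int pathId) {
        if (pathId < 0 || String.valueOf(pathId).length() > WIDTH) {
            throw new IllegalArgumentException("pathId out of range: " + pathId);
        }
        // Locale.ROOT keeps the digits ASCII regardless of the default locale.
        return String.format(Locale.ROOT, "%0" + WIDTH + "d", pathId);
    }

    // Inclusive range check using plain string comparison of terms.
    static boolean inRange(String term, String lower, String upper) {
        return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        // Unpadded: "2" sorts after "1000", so it falls outside "1 to 1000".
        System.out.println(inRange("2", "1", "1000"));                // prints false

        // Padded: "0000002" lies between "0000001" and "0001000".
        System.out.println(inRange(pad(2), pad(1), pad(1000)));       // prints true
    }
}
```

The same pad function would have to be applied both when indexing the pathID field and when building the lower and upper bounds of the range.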


-Hoss

