Posted to java-user@lucene.apache.org by parth_n <na...@asu.edu> on 2014/06/08 07:10:28 UTC
Lucene Spatial Question: How to retrieve all results within a
bounding box?
Hi everyone,
I am trying to retrieve all results within a given bounding box in a 2-D
space. I understand that the scoring function is based on the distance from
the center of the query. I am not looking to retrieve top-k results, but all
of them.
I have read previous forum threads on this question, and the solutions are
either outdated (written for earlier versions) or inefficient (Option 1:
pass Integer.MAX_VALUE as k; Option 2: use a TotalHitCountCollector, get the
total number of results from getTotalHits, and then pass that number to
the top-k search).
I am looking for all the results in the bounding box, and do not care for
the order. I do not want to waste any computation, if possible, on any
sorting needed for top-k functionality.
Question: Is there a better solution that I can use instead of the two
options above?
Any reply is much appreciated. Thanks!
Code snippet for Option 1 above:
SpatialArgs args = new SpatialArgs(SpatialOperation.IsWithin,
    ctx.makeRectangle(minX, maxX, minY, maxY));
Filter filter = strategy.makeFilter(args);
TopDocs topDocs = searcher.search(new MatchAllDocsQuery(), filter,
    Integer.MAX_VALUE);
ScoreDoc[] scoreDocs = topDocs.scoreDocs;
for (ScoreDoc s : scoreDocs)
{
    Document doc = searcher.doc(s.doc);
    System.out.println(doc.get("id") + "\t" + doc.get("name"));
}
--
View this message in context: http://lucene.472066.n3.nabble.com/Lucene-Spatial-Question-How-to-retrieve-all-results-within-a-bounding-box-tp4140616.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: Lucene Spatial Question: How to retrieve all results within a
bounding box?
Posted by "david.w.smiley@gmail.com" <da...@gmail.com>.
Yes; as I said in my last sentence: “You’ll see a difference of Document vs
StoredDocument with 4.x”. That is, on 4.x use Document where the trunk code
uses StoredDocument.
As to SimpleCollector not being in 4.x (I didn’t check, but I’ll take your
word for it): the bottom line is that you need to write a Collector, and a
simple one at that.
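For readers on 4.x, the shape of such a collector can be sketched without any
Lucene dependency. In 4.x the class to extend is
org.apache.lucene.search.Collector, whose abstract methods are setScorer,
collect, setNextReader and acceptsDocsOutOfOrder. What follows is only a
library-free model of that pattern, not Lucene code: SegmentCollector,
AllHitsCollector and collectAll are hypothetical names, and the arrays stand in
for the per-segment hits a real search would deliver. It shows the two things
the collector has to do: remember which segment it is in, and record each hit
without any scoring or sorting.

```java
import java.util.ArrayList;
import java.util.List;

public class CollectAllSketch {

    /** Hypothetical stand-in for a per-segment hit callback; not a Lucene type. */
    interface SegmentCollector {
        void setNextSegment(int docBase); // called once before each segment's hits
        void collect(int segmentDocId);   // called once per matching document
    }

    /** Records every hit as an index-wide doc ID, with no scoring and no sorting. */
    static final class AllHitsCollector implements SegmentCollector {
        private final List<Integer> globalDocIds = new ArrayList<>();
        private int docBase;

        @Override
        public void setNextSegment(int docBase) {
            this.docBase = docBase;
        }

        @Override
        public void collect(int segmentDocId) {
            // Hits arrive with segment-relative IDs; add the segment's base offset.
            globalDocIds.add(docBase + segmentDocId);
        }

        List<Integer> hits() {
            return globalDocIds;
        }
    }

    /** Toy driver: segments[i] holds the segment-relative IDs that matched in segment i. */
    static List<Integer> collectAll(int[][] segments, int[] docBases) {
        AllHitsCollector collector = new AllHitsCollector();
        for (int i = 0; i < segments.length; i++) {
            collector.setNextSegment(docBases[i]);
            for (int doc : segments[i]) {
                collector.collect(doc);
            }
        }
        return collector.hits();
    }

    public static void main(String[] args) {
        int[][] matches = { {0, 3}, {1}, {2, 4} }; // made-up hits in three segments
        int[] docBases = { 0, 10, 25 };            // made-up starting offsets
        System.out.println(collectAll(matches, docBases)); // prints [0, 3, 11, 27, 29]
    }
}
```

Each hit costs one list append, whereas a top-k search maintains a priority
queue of k entries, and that is the work the original question wanted to avoid.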
~ David Smiley
Freelance Apache Lucene/Solr Search Consultant/Developer
http://www.linkedin.com/in/davidwsmiley
On Sun, Jun 8, 2014 at 5:00 PM, parth_n <na...@asu.edu> wrote:
> Thanks a lot for the reply David!
>
> I am having some problems executing this code. I am using 4.8.1. I tried
> looking for StoredDocument and SimpleCollector in the source code but
> couldn't find them. Am I missing something?
Re: Lucene Spatial Question: How to retrieve all results within a
bounding box?
Posted by parth_n <na...@asu.edu>.
Thanks a lot for the reply David!
I am having some problems executing this code. I am using 4.8.1. I tried
looking for StoredDocument and SimpleCollector in the source code but
couldn't find them. Am I missing something?
Re: Lucene Spatial Question: How to retrieve all results within a
bounding box?
Posted by "david.w.smiley@gmail.com" <da...@gmail.com>.
Hi.
Your question is actually not particularly spatial; it has more to do with
how you run the query. You want to know how to do a query and collect *all*
the results, in no particular order. To do this efficiently, you need to use
a Collector. Also, I noticed you are using the “IsWithin” predicate. If all
of your data consists of points, then “Intersects” is semantically
equivalent and faster. Here’s some sample code I temporarily threw into
SpatialExample.java that works on Lucene trunk. You’ll see a difference of
Document vs StoredDocument with 4.x:
{
  SpatialArgs args = new SpatialArgs(SpatialOperation.IsWithin,
      ctx.makeRectangle(-90, -60, 30, 40));
  indexSearcher.search(strategy.makeQuery(args),
      new SimpleCollector() {
        // Reader for the segment currently being collected.
        public AtomicReader reader;
        @Override
        public boolean acceptsDocsOutOfOrder() {
          return true; // result order does not matter here
        }
        @Override
        protected void doSetNextReader(AtomicReaderContext context)
            throws IOException {
          this.reader = context.reader();
        }
        @Override
        public void collect(int docId) throws IOException {
          // docId is relative to the current segment, so load the
          // document from that segment's reader.
          StoredDocument doc = reader.document(docId);
          System.out.println(doc.get("id") + "\t" +
              doc.get("myGeoField"));
        }
      });
}
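On the IsWithin vs Intersects remark above, here is a small plain-Java
illustration of why the two predicates agree for point data but not for shapes
with area. It is not Lucene or Spatial4j code: Rect, intersects and isWithin
are hypothetical helpers, boundaries are treated as inclusive, and dateline
wrapping is ignored. The argument order (minX, maxX, minY, maxY) mirrors
ctx.makeRectangle.

```java
public class PredicateSketch {

    /** Axis-aligned rectangle; a point is the degenerate case min == max on both axes. */
    static final class Rect {
        final double minX, maxX, minY, maxY;

        Rect(double minX, double maxX, double minY, double maxY) {
            this.minX = minX;
            this.maxX = maxX;
            this.minY = minY;
            this.maxY = maxY;
        }

        static Rect point(double x, double y) {
            return new Rect(x, x, y, y);
        }
    }

    /** True if the indexed shape and the query share at least one location. */
    static boolean intersects(Rect shape, Rect query) {
        return shape.minX <= query.maxX && shape.maxX >= query.minX
            && shape.minY <= query.maxY && shape.maxY >= query.minY;
    }

    /** True if the indexed shape lies entirely inside the query. */
    static boolean isWithin(Rect shape, Rect query) {
        return shape.minX >= query.minX && shape.maxX <= query.maxX
            && shape.minY >= query.minY && shape.maxY <= query.maxY;
    }

    public static void main(String[] args) {
        Rect query = new Rect(-90, -60, 30, 40);        // same box as the sample above
        Rect inside = Rect.point(-75, 35);              // point inside the box
        Rect outside = Rect.point(-50, 35);             // point east of the box
        Rect straddling = new Rect(-70, -50, 32, 38);   // area crossing the east edge

        // For points the two predicates always give the same answer.
        System.out.println(intersects(inside, query) + " " + isWithin(inside, query));
        System.out.println(intersects(outside, query) + " " + isWithin(outside, query));
        // For a shape with area they can differ.
        System.out.println(intersects(straddling, query) + " " + isWithin(straddling, query));
    }
}
```

For a point, both checks reduce to the same four comparisons, so the results
cannot differ. The performance difference between the predicates comes from how
the spatial strategy evaluates them and is not modeled here.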
~ David Smiley
Freelance Apache Lucene/Solr Search Consultant/Developer
http://www.linkedin.com/in/davidwsmiley