Posted to java-user@lucene.apache.org by Randall Tidd <rc...@tidd.cc> on 2016/06/04 16:34:29 UTC

"Point in polygon" search with Lucene / Spatial4j / JTS

Hello,

I have what I think is a relatively simple use case that I’d like to use Lucene to solve.  We have a database of hundreds of thousands of data points, each of which has a latitude and longitude.  I’d like to index these and then search for them with an arbitrary polygon.  For example, I would define an irregular polygon roughly encompassing northern San Francisco and search for all points that are within that polygon.

I see some examples that are close to what I need but have some questions.  For example there is this:

https://github.com/apache/lucene-solr/blob/branch_4x/lucene/spatial/src/test/org/apache/lucene/spatial/SpatialExample.java

But 1) this is based on Lucene 4.x while the latest version is 6.x, and 2) it searches for points within a circle, not a polygon. 

I would be happy to use Lucene 4.x if that provides the best support but that version is getting old now and I wonder if there is better, easier support in later versions.

Doing some reading, I see that Spatial4j and JTS provide support for polygons, but I can’t figure out how to define a polygon that can be used with SpatialArgs.  I found com.vividsolutions.jts.geom.GeometryFactory.createPolygon() but am not sure how to get that into a JTS Shape which can be used with SpatialArgs.  Basically I’m getting lost in the three sets of APIs (Lucene, Spatial4j, and JTS), wondering how they go together, and can’t find an example.

I have also tried parsing arguments to create a polygon like this:

SpatialArgs args = new SpatialArgsParser().parse("IsWithin(POLYGON-122.515193 37.781561, -122.472924 37.809958, -122.383509 37.808795))", ctx);

But I get "java.text.ParseException: Unknown Shape definition" from com.spatial4j.core.io.WKTReader.parse, so evidently it is not using JTS to create the polygon.  JTS is on my class path, so I’m not sure what is wrong there.
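
Whatever the cause of the exception turns out to be, the WKT text itself also needs to be well formed: a polygon wraps its ring in double parentheses, lists each vertex as "longitude latitude", and repeats the first vertex at the end. The string above has neither the parentheses nor the closing vertex (the mail archive may also have altered it). A minimal plain-Java sketch that builds such a string follows; the class and method names are hypothetical, and nothing here calls Lucene, Spatial4j, or JTS.

```java
import java.util.Locale;

final class WktPolygonSketch {

    /** Builds "POLYGON((lon lat, ...))" from parallel arrays, closing the ring if needed. */
    static String toWktPolygon(double[] lats, double[] lons) {
        if (lats.length != lons.length || lats.length < 3) {
            throw new IllegalArgumentException("need at least 3 lat/lon pairs of equal length");
        }
        StringBuilder sb = new StringBuilder("POLYGON((");
        for (int i = 0; i < lats.length; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            appendVertex(sb, lats[i], lons[i]);
        }
        // WKT rings must end on the vertex they started with.
        int last = lats.length - 1;
        boolean closed = lats[0] == lats[last] && lons[0] == lons[last];
        if (!closed) {
            sb.append(", ");
            appendVertex(sb, lats[0], lons[0]);
        }
        return sb.append("))").toString();
    }

    // WKT order is x (longitude) first, then y (latitude).
    private static void appendVertex(StringBuilder sb, double lat, double lon) {
        sb.append(String.format(Locale.ROOT, "%.6f %.6f", lon, lat));
    }

    public static void main(String[] args) {
        // Vertices taken from the example string in this message.
        double[] lats = {37.781561, 37.809958, 37.808795};
        double[] lons = {-122.515193, -122.472924, -122.383509};
        System.out.println("IsWithin(" + toWktPolygon(lats, lons) + ")");
    }
}
```

The resulting text is only the shape argument; whether a given SpatialContext can parse POLYGON at all is a separate question.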

I see mention of some Solr and ElasticSearch solutions, which I believe would just use this Lucene functionality underneath.  I’d be happy to use those if it were easier, but it seems like I should be able to do this with just Lucene.

Here is the combination of toolkits I’m trying to use:

compile group: 'org.apache.lucene', name: 'lucene-core', version: '4.10.4'
compile group: 'org.apache.lucene', name: 'lucene-analyzers-common', version: '4.10.4'
compile group: 'org.apache.lucene', name: 'lucene-spatial', version: '4.10.4'
compile group: 'com.spatial4j', name: 'spatial4j', version: '0.5'
compile group: 'com.vividsolutions', name: 'jts', version: '1.13'

I’ve spent more time figuring this out than I thought I’d have to and think I must be missing something obvious, and am wondering if someone can help me out.

Thanks,
Randy


Re: "Point in polygon" search with Lucene / Spatial4j / JTS

Posted by Michael McCandless <lu...@mikemccandless.com>.
Once 6.1 is out (in a few weeks) the best option Lucene has for polygon
searching is the new LatLonPoint.newPolygonQuery.

It's the fastest option (see our nightly geo benchmarks:
http://home.apache.org/~mikemccand/geobench.html), the API is simple, the
implementation is simple, etc.

It indexes the 2D lat/lon point using Lucene's new (as of 6.0) dimensional
points.
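
For a small data set, a brute-force check can be a handy way to sanity-test what an indexed polygon query returns. The sketch below is a plain-Java even-odd (ray casting) point-in-polygon test. It treats lat/lon as planar coordinates, ignores holes and dateline crossing, and only illustrates the predicate; it is not how LatLonPoint is implemented, and all names are hypothetical.

```java
final class PointInPolygonSketch {

    /**
     * Even-odd rule: cast a ray from the point towards increasing longitude
     * and flip the answer each time it crosses a polygon edge.
     * The ring may be given open or closed (a repeated last vertex adds a
     * zero-length edge, which is never crossed).
     */
    static boolean contains(double[] polyLats, double[] polyLons, double lat, double lon) {
        boolean inside = false;
        int n = polyLats.length;
        for (int i = 0, j = n - 1; i < n; j = i++) {
            // Does the edge from vertex j to vertex i straddle the point's latitude?
            boolean straddles = (polyLats[i] > lat) != (polyLats[j] > lat);
            if (straddles) {
                // Longitude at which the edge crosses that latitude.
                double crossLon = polyLons[j]
                        + (lat - polyLats[j]) / (polyLats[i] - polyLats[j])
                        * (polyLons[i] - polyLons[j]);
                if (lon < crossLon) {
                    inside = !inside;
                }
            }
        }
        return inside;
    }

    public static void main(String[] args) {
        // Triangle built from the three vertices in the original question.
        double[] lats = {37.781561, 37.809958, 37.808795};
        double[] lons = {-122.515193, -122.472924, -122.383509};
        System.out.println(contains(lats, lons, 37.80, -122.46));
        System.out.println(contains(lats, lons, 37.70, -122.46));
    }
}
```

Running every stored point through such a check is linear in the number of points, which is exactly the cost an index is meant to avoid, so this is a cross-check rather than a replacement.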

Mike McCandless

http://blog.mikemccandless.com


Re: "Point in polygon" search with Lucene / Spatial4j / JTS

Posted by Michael McCandless <lu...@mikemccandless.com>.
FYI I just pushed an improvement (will be in Lucene 6.2) to Lucene's
Polygon class, to make it easy to construct Polygons from a GeoJSON string
without using an external spatial library:
https://issues.apache.org/jira/browse/LUCENE-7380

That issue just adds a new Polygon.fromGeoJSON(String) static method.
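
As a rough illustration of the kind of input such a method consumes: a GeoJSON Polygon lists each ring as an array of [longitude, latitude] positions (note the order), with the first position repeated at the end. The plain-Java helper below only assembles that string. Its names are hypothetical, and exactly which GeoJSON features Polygon.fromGeoJSON accepts should be checked against LUCENE-7380.

```java
import java.util.Locale;

final class GeoJsonPolygonSketch {

    /** Builds a single-ring GeoJSON Polygon string from parallel lat/lon arrays. */
    static String toGeoJson(double[] lats, double[] lons) {
        if (lats.length != lons.length || lats.length < 3) {
            throw new IllegalArgumentException("need at least 3 lat/lon pairs of equal length");
        }
        StringBuilder sb = new StringBuilder("{\"type\":\"Polygon\",\"coordinates\":[[");
        for (int i = 0; i < lats.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            appendPosition(sb, lats[i], lons[i]);
        }
        // GeoJSON linear rings repeat the first position at the end.
        int last = lats.length - 1;
        boolean closed = lats[0] == lats[last] && lons[0] == lons[last];
        if (!closed) {
            sb.append(',');
            appendPosition(sb, lats[0], lons[0]);
        }
        return sb.append("]]}").toString();
    }

    // GeoJSON positions are [longitude, latitude], the reverse of "lat, lon".
    private static void appendPosition(StringBuilder sb, double lat, double lon) {
        sb.append(String.format(Locale.ROOT, "[%.6f,%.6f]", lon, lat));
    }

    public static void main(String[] args) {
        // Vertices from the original question in this thread.
        double[] lats = {37.781561, 37.809958, 37.808795};
        double[] lons = {-122.515193, -122.472924, -122.383509};
        System.out.println(toGeoJson(lats, lons));
    }
}
```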

Mike McCandless

http://blog.mikemccandless.com


Re: "Point in polygon" search with Lucene / Spatial4j / JTS

Posted by Michael McCandless <lu...@mikemccandless.com>.
On Wed, Jun 8, 2016 at 10:54 AM, Randall Tidd <rc...@tidd.cc> wrote:

> >>  I see that it still depends on JTS.
> >
> > Correction: LatLonPoint most definitely does NOT depend on JTS
>
> I’m sorry this is my mistake, I was using
> org.locationtech.spatial4j.context.jts.JtsSpatialContext in my example
> which depends on JTS.  However I now realize I’m able to switch to
> org.locationtech.spatial4j.context.SpatialContext and my example works fine
> because I’m only indexing points and searching for points within a polygon
> and LatLonPoint handles this (in conjunction with
> LatLonPointInPolygonQuery).  Thank you for pointing that out.
>

Hmm but why do you even need a spatial4j SpatialContext when using
LatLonPoint?  You shouldn't need spatial4j at all, to do "indexed point in
polygon query".

Mike McCandless

http://blog.mikemccandless.com

Re: "Point in polygon" search with Lucene / Spatial4j / JTS

Posted by Randall Tidd <rc...@tidd.cc>.
>>  I see that it still depends on JTS.
> 
> Correction: LatLonPoint most definitely does NOT depend on JTS

I’m sorry, this is my mistake: I was using org.locationtech.spatial4j.context.jts.JtsSpatialContext in my example, which depends on JTS.  However, I now realize I’m able to switch to org.locationtech.spatial4j.context.SpatialContext and my example works fine, because I’m only indexing points and searching for points within a polygon, and LatLonPoint handles this (in conjunction with LatLonPointInPolygonQuery).  Thank you for pointing that out.

Randy



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: "Point in polygon" search with Lucene / Spatial4j / JTS

Posted by Michael McCandless <lu...@mikemccandless.com>.
On Tue, Jun 7, 2016 at 3:43 PM, Randall Tidd <rc...@tidd.cc> wrote:

> With LatLonPoint.newPolygonQuery() it looks like I don’t need Spatial4j at
> all any more either.  This makes my case very simple, I only have to index
> LatLonPoint’s and then do a query search with
> LatLonPoint.newPolygonQuery().  I see that it still depends on JTS.
>

Correction: LatLonPoint most definitely does NOT depend on JTS.  It has no
external dependencies, was designed to have a very simple API, and builds
on Lucene 6.0's new dimensional points to enable fast "point in
polygon/distance/box" filters at search time.

There's also LatLonDocValuesField if you want to sort hits by distance.
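
To make "sort hits by distance" concrete, here is a plain-Java sketch that orders points by great-circle (haversine) distance from an origin. It illustrates the ordering only and does not use LatLonDocValuesField; the Earth radius constant and all names are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class DistanceSortSketch {

    // Mean Earth radius in meters (an assumed value for this sketch).
    static final double EARTH_RADIUS_M = 6_371_008.8;

    /** Great-circle distance in meters between two lat/lon points given in degrees. */
    static double haversineMeters(double lat1, double lon1, double lat2, double lon2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double dPhi = Math.toRadians(lat2 - lat1);
        double dLambda = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dPhi / 2) * Math.sin(dPhi / 2)
                + Math.cos(phi1) * Math.cos(phi2)
                * Math.sin(dLambda / 2) * Math.sin(dLambda / 2);
        // Clamp to 1.0 to guard against rounding pushing sqrt(a) slightly above 1.
        return 2 * EARTH_RADIUS_M * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    /** Returns a copy of the {lat, lon} pairs sorted nearest-first from the origin. */
    static List<double[]> sortByDistance(List<double[]> points, double originLat, double originLon) {
        List<double[]> sorted = new ArrayList<>(points);
        sorted.sort(Comparator.comparingDouble(
                (double[] p) -> haversineMeters(originLat, originLon, p[0], p[1])));
        return sorted;
    }

    public static void main(String[] args) {
        List<double[]> points = new ArrayList<>();
        points.add(new double[] {37.808795, -122.383509});
        points.add(new double[] {37.781561, -122.515193});
        points.add(new double[] {37.809958, -122.472924});
        for (double[] p : sortByDistance(points, 37.80, -122.47)) {
            System.out.println(p[0] + ", " + p[1]);
        }
    }
}
```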

If you need to do more advanced spatial stuff, like indexing shapes, then
you should look at the spatial-extras module with the JTS dependency.

Mike McCandless

http://blog.mikemccandless.com

Re: "Point in polygon" search with Lucene / Spatial4j / JTS

Posted by Randall Tidd <rc...@tidd.cc>.
David,

Thank you for the response.  I did indeed stumble upon an example using Lucene 4.x and used it only because that is what I found; I see now that there are examples in lucene-solr 5.x and 6.x.  After your response I did some more research and see all of the work that’s been done on spatial searches in 5.x and 6.x.  I think I’ll try to use 6.0.1 and then move to 6.1 when it comes out.

I now see that after the Lucene API “query filter merger” the newer examples use query searches and no filters.  With LatLonPoint.newPolygonQuery() it looks like I don’t need Spatial4j at all any more either.  This makes my case very simple: I only have to index LatLonPoints and then do a query search with LatLonPoint.newPolygonQuery().  I see that it still depends on JTS.

By the way while researching this I happened to Google up the talk that you recently gave at Harvard CGA, thanks for that, very timely for what I’m working on!

Thanks again for your help -

Randy



Re: "Point in polygon" search with Lucene / Spatial4j / JTS

Posted by David Smiley <da...@gmail.com>.
Hello Randy.

If you are on Lucene 6x, or possibly some late 5x releases, there are newer
Lucene spatial implementations that have fewer moving parts to them and so
will be simpler.  I'm almost certain they would be fastest too, although
perhaps that's not much of an issue with only 100k's of data points.  See
LatLonPoint in the "sandbox" module, which will likely graduate to
"spatial" for v6.1.  In particular see the newPolygonQuery method.

To stay within the SpatialStrategy API (in 6x this is in the
"spatial-extras" module, in 4x and 5x it was simply the "spatial" module),
you would approach this like so:

Use JtsSpatialContext.GEO, or construct an instance (for example via
JtsSpatialContextFactory) if you wish to customize settings.  This is called
the "spatial context",
often in a variable of "ctx".  If you reference the latest Spatial4j 0.6
then you can call ctx.getShapeFactory().polygon() which is a builder for a
polygon -- I'm sure you'll figure it out.
https://github.com/locationtech/spatial4j/blob/master/src/main/java/org/locationtech/spatial4j/shape/ShapeFactory.java
This is for building a shape manually / programmatically.  If instead you
have GeoJSON or WKT then you will find readers for them from the context.
SpatialArgsParser is another option but I suggest avoiding it.  Spatial4j
0.6 will definitely work with Lucene 6x, very likely with 5x, somewhat
likely with 4x.

Now you have an instance of a Shape and you can see in the
SpatialExample.java you referenced how to search by a shape.  For example
in the filter-by-circle example, swap out the ctx.makeCircle reference with
your newly constructed polygon shape instance.  Otherwise that's it.  It's
not clear if you happened upon that example in the 4x branch but are
unaware it exists in 5x & 6x or whether you deliberately referenced 4x
because you must use that version.

Good luck,

~ David

--
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com