You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Nicholas Knize (JIRA)" <ji...@apache.org> on 2018/07/12 19:26:00 UTC

[jira] [Comment Edited] (LUCENE-8396) Add Points Based Shape Indexing

    [ https://issues.apache.org/jira/browse/LUCENE-8396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542052#comment-16542052 ] 

Nicholas Knize edited comment on LUCENE-8396 at 7/12/18 7:25 PM:
-----------------------------------------------------------------

I've attached a WIP patch for initial feedback.

Patch summary:

* adds a new `LatLonShape` abstract class that uses factory methods to index and search.
* `createIndexableFields` static method creates an array of `TriangleField` objects
* to index the shape: iterate over the `TriangleField` array calling `document.add` which will add each tessellated triangle to the index
* like `LatLonPoint` the `LatLonShape.newBoxQuery` method will create a new `LatLonShapeBoundingBoxQuery`
* finally, there is a `TestLatLonShapeQueries` class that is very similar to `BaseGeoPointTestCase` in that it will test 3 different sized random data sets (Tiny, Medium, Big)
* along with `geo.TestTessellator` and `document.TestLatLonShape` that contain some explicit testing for each of those classes

Javadocs are included. There are many todos, such as generalizing the bounding box query into a generic shape query that will query indexed shapes with provided query shapes. I think this can be done rather easily by adding a {{relateTriangle}} utility method to {{Polygon2D}} for computing relations of indexed triangles with the target shape. That, however, will be left as a future feature.


was (Author: nknize):
I've attached a WIP patch for initial feedback.

Patch summary:

* adds a new `LatLonShape` abstract class that uses factory methods to index and search.
* `createIndexableFields` static method creates an array of `TriangleField` objects
* to index the shape: iterate over the `TriangleField` array calling `document.add` which will add each tessellated triangle to the index
* like `LatLonPoint` the `LatLonShape.newBoxQuery` method will create a new `LatLonShapeBoundingBoxQuery`
* finally, there is a `TestLatLonPolygonQueries` class that is very similar to `BaseGeoPointTestCase` in that it will test 3 different sized random data sets (Tiny, Medium, Big)
* along with `geo.TestTessellator` and `document.TestLatLonPolygon` that contain some explicit testing for each of those classes

Javadocs are included. There are many todos, such as generalizing the bounding box query into a generic shape query that will query indexed shapes with provided query shapes. I think this can be done rather easily by adding a {{relateTriangle}} utility method to {{Polygon2D}} for computing relations of indexed triangles with the target shape. That, however, will be left as a future feature.

> Add Points Based Shape Indexing
> -------------------------------
>
>                 Key: LUCENE-8396
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8396
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Nicholas Knize
>            Priority: Major
>         Attachments: LUCENE-8396.patch, polyWHole.png, tessellatedPoly.png
>
>
> I've been tinkering with this for a while and would like to solicit some feedback. I'd like to introduce a new shape field based on the BKD/Points codec to bring much of the Points based performance improvements to the shape indexing and search usecase. Much like the existing shape indexing in {{spatial-extras}} the shape will be decomposed into smaller parts, but instead of decomposing into quad cells (which have the drawback of precision accuracy and sheer volume of terms) I'd like to explore decomposing the shapes into a triangular mesh; similar to gaming and computer graphics. Not only does this approach reduce the number of terms, but it has the added benefit of better accuracy (precision is based on the index encoding technique instead of the spatial resolution of the quad cell). 
> For better clarity, consider the following illustrations (of a polygon in a 1 degree x 1 degree spatial area).  The first is using the quad tree technique applied in the existing inverted index. The second is using a triangular mesh decomposition as used by popular OpenGL and javascript rendering systems (such as those used by mapbox).
> !polyWHole.png!
> Decomposing this shape using a quad tree results in 1,105,889 quad terms at 3 meter spatial resolution.
> !tessellatedPoly.png!
>  
> Decomposing using a triangular mesh results in 8 triangles at the same resolution as {{encodeLat/Lon}}.
> The decomposed triangles can then be encoded as a 6 dimensional POINT and queries are implemented using the computed relations against these triangles (similar to how its done with the inverted index today).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org