Posted to issues@lucene.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2022/08/09 17:52:00 UTC

[jira] [Commented] (LUCENE-10654) New companion doc value format for LatLonShape and XYShape field types

    [ https://issues.apache.org/jira/browse/LUCENE-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17577561#comment-17577561 ] 

ASF subversion and git services commented on LUCENE-10654:
----------------------------------------------------------

Commit d7fd48c9502c567e4760a011fa99b1a491fea2cb in lucene's branch refs/heads/main from Nick Knize
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=d7fd48c9502 ]

LUCENE-10654: Add new ShapeDocValuesField for LatLonShape and XYShape (#1017)

Adds a new doc value field to support LatLonShape and XYShape doc values. The
implementation is inspired by ComponentTree. A binary tree of tessellated
components (point, line, or triangle) is created. This tree is then serialized
depth-first to a variable-length compressed DataOutput buffer to keep the doc
value format as compact as possible.

Doc value queries are performed on the serialized tree using component relation
logic similar to that found in SpatialQuery for BKD-indexed shapes. To make
this possible, some of the relation logic is refactored to make it accessible
to the doc value query counterpart.

Note that this does not support the following:

* Multi Geometries or Collections - This will be investigated by exploring
  the addition of multi binary doc values.
* General Geometry Queries - This will be added in a follow-on improvement.

Signed-off-by: Nicholas Walter Knize <nk...@apache.org>
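
The depth-first serialization described in the commit message above can be
illustrated with a minimal sketch. It does not use Lucene's classes: the
ShapeTreeSketch class, its Node type, the per-node flag byte, and the single
int payload are all hypothetical stand-ins, and the real format stores encoded
component vertices and bounding information rather than one integer. The
writeVInt helper assumes the common 7-bits-per-byte variable-length scheme.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

final class ShapeTreeSketch {

  /** Hypothetical node; a real component would carry vertices and a bounding box. */
  static final class Node {
    final int value;
    final Node left;
    final Node right;

    Node(int value, Node left, Node right) {
      this.value = value;
      this.left = left;
      this.right = right;
    }
  }

  /** Writes an int using 7 bits per byte; the high bit means "more bytes follow". */
  static void writeVInt(ByteArrayOutputStream out, int i) {
    while ((i & ~0x7F) != 0) {
      out.write((i & 0x7F) | 0x80);
      i >>>= 7;
    }
    out.write(i);
  }

  /** Pre-order (depth-first) write: flag byte, node value, left subtree, right subtree. */
  static void serialize(Node node, ByteArrayOutputStream out) {
    int flags = (node.left != null ? 1 : 0) | (node.right != null ? 2 : 0);
    out.write(flags);
    writeVInt(out, node.value);
    if (node.left != null) {
      serialize(node.left, out);
    }
    if (node.right != null) {
      serialize(node.right, out);
    }
  }

  /** Reads values back in the same depth-first order; returns the next read position. */
  static int readValues(byte[] buf, int pos, List<Integer> values) {
    int flags = buf[pos++];
    int value = 0;
    int shift = 0;
    byte b;
    do {
      b = buf[pos++];
      value |= (b & 0x7F) << shift;
      shift += 7;
    } while ((b & 0x80) != 0);
    values.add(value);
    if ((flags & 1) != 0) {
      pos = readValues(buf, pos, values);
    }
    if ((flags & 2) != 0) {
      pos = readValues(buf, pos, values);
    }
    return pos;
  }

  public static void main(String[] args) {
    Node root = new Node(300, new Node(5, null, null), new Node(70000, null, null));
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    serialize(root, out);
    List<Integer> values = new ArrayList<>();
    readValues(out.toByteArray(), 0, values);
    System.out.println(out.size() + " bytes, values " + values);
  }
}
```

Small values take a single byte in this scheme, which is the kind of saving
the commit message points to when it mentions keeping the format compact.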

> New companion doc value format for LatLonShape and XYShape field types
> ----------------------------------------------------------------------
>
>                 Key: LUCENE-10654
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10654
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Nick Knize
>            Priority: Major
>             Fix For: 9.4
>
>          Time Spent: 7h
>  Remaining Estimate: 0h
>
> {{XYDocValuesField}} provides doc value support for {{XYPoint}}.
> {{LatLonDocValuesField}} provides doc value support for {{LatLonPoint}}.
> However, neither {{LatLonShape}} nor {{XYShape}} currently has a doc value format.
> This lack of doc value support for shapes means facets, aggregations, and IndexOrDocValues queries are currently not possible for Shape field types. This gap needs to be closed in Lucene.
> To support IndexOrDocValues queries along with various geometry aggregations and facets, the ability to compute the spatial relation with the doc value is needed. This is straightforward with {{XYPoint}} and {{LatLonPoint}} since the doc value encoding is nothing more than a simple 2D integer encoding of the x, y and lat, lon dimensional components. Accomplishing the same with a naive integer-encoded binary representation for N-vertex shapes would be costly.
> {{ComponentTree}} already provides an efficient in-memory structure for quickly computing spatial relations over Shape types based on a binary tree of tessellated triangles provided by the {{Tessellator}}. Furthermore, this tessellation is already computed at index time. If we create an on-disk representation of {{ComponentTree}}'s binary tree of tessellated triangles and use this as the doc value {{binaryValue}} format, we will be able to efficiently compute spatial relations with this binary representation and achieve the same facet/aggregation result over shapes as we can with points today (e.g., grid facets, centroid, area, etc.).
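
For comparison, the "simple 2D integer encoding" that the quoted description
attributes to point doc values can be sketched as follows. This is an
illustration only, not Lucene's code: the PointDocValueSketch class, its
method names, and the scaling constants are assumptions, chosen so that each
coordinate maps onto the full 32-bit integer range and both fit in one long.

```java
final class PointDocValueSketch {

  // Assumed scale factors: spread the coordinate range over 2^32 integer steps.
  private static final double LAT_SCALE = (1L << 32) / 180.0;
  private static final double LON_SCALE = (1L << 32) / 360.0;

  /** Quantizes a latitude in [-90, 90]; the double-to-int cast saturates at the extremes. */
  static int encodeLat(double lat) {
    return (int) Math.floor(lat * LAT_SCALE);
  }

  /** Quantizes a longitude in [-180, 180]. */
  static int encodeLon(double lon) {
    return (int) Math.floor(lon * LON_SCALE);
  }

  static double decodeLat(int encoded) {
    return encoded / LAT_SCALE;
  }

  static double decodeLon(int encoded) {
    return encoded / LON_SCALE;
  }

  /** Packs both dimensions into one long: latitude in the high 32 bits, longitude in the low. */
  static long pack(double lat, double lon) {
    return (((long) encodeLat(lat)) << 32) | (encodeLon(lon) & 0xFFFFFFFFL);
  }

  static int unpackLat(long packed) {
    return (int) (packed >>> 32);
  }

  static int unpackLon(long packed) {
    return (int) packed;
  }

  public static void main(String[] args) {
    long packed = pack(40.7128, -74.0060);
    System.out.println(decodeLat(unpackLat(packed)) + ", " + decodeLon(unpackLon(packed)));
  }
}
```

A point fits in a single fixed-width value this way, whereas an N-vertex shape
has no fixed size, which is why the description turns to a serialized tree of
tessellated components instead.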


