Posted to issues@lucene.apache.org by "Nick Knize (Jira)" <ji...@apache.org> on 2022/08/11 21:09:00 UTC

[jira] [Comment Edited] (LUCENE-10654) New companion doc value format for LatLonShape and XYShape field types

    [ https://issues.apache.org/jira/browse/LUCENE-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17578674#comment-17578674 ] 

Nick Knize edited comment on LUCENE-10654 at 8/11/22 9:08 PM:
--------------------------------------------------------------

Nightly test failure on XY bounding box:


{code:java}
Reproduce with: gradlew :lucene:core:test --tests "org.apache.lucene.document.TestShapeDocValues.testXYPolygonBBox" -Ptests.jvms=4 -Ptests.haltonfailure=false -Ptests.jvmargs=-XX:TieredStopAtLevel=1 -Ptests.seed=ABDF070B81479950 -Ptests.multiplier=2 -Ptests.nightly=true -Ptests.badapples=false -Ptests.gui=true -Ptests.file.encoding=ISO-8859-1 -Ptests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene/Lucene-NightlyTests-main/test-data/enwiki.random.lines.txt
{code}



{code:java}
1 tests failed.
FAILED:  org.apache.lucene.document.TestShapeDocValues.testXYPolygonBBox

Error Message:
java.lang.AssertionError: expected:<-2.028229934961692E32> but was:<-2.026382696309321E32>
{code}


This happens because {{TestUtil.nextPolygon}} produced a polygon with an extruded collinear, self-intersecting vertex, and {{BaseXYShapeTestCase}} did not reject it as an invalid polygon because the test case uses {{randomBoolean}} to decide whether to validate. The simple fix is to switch the test case to always throw an exception on invalid polygons so we never test with a non-compliant polygon. The queries passed because the tessellator filters out the dirty vertex. This test failed because the dirty vertex just happened to be the minimum X value. So this does expose an inconsistency: an invalid polygon will have a bounding box that is inconsistent with the raw geometry. I think that's okay because we have API guardrails to enable or disable strict validation, and I don't think those should be removed.
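
The mismatch described above can be illustrated with a small, self-contained sketch. It does not use Lucene: the class {{BBoxMismatchSketch}}, the {{dropCollinear}} helper, and the sample coordinates are all hypothetical, and {{dropCollinear}} is only a stand-in for the kind of cleanup a tessellator might perform. It shows how a collinear spike vertex can hold the minimum X of the raw ring while vanishing from the cleaned ring.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: not Lucene code. All names here are hypothetical.
class BBoxMismatchSketch {

  /** A 10x10 square with a collinear spike at (-5, 10) on its top edge. */
  static List<double[]> spikedSquare() {
    List<double[]> ring = new ArrayList<>();
    ring.add(new double[] {0, 0});
    ring.add(new double[] {10, 0});
    ring.add(new double[] {10, 10});
    ring.add(new double[] {-5, 10}); // the "dirty" vertex: overshoots, then doubles back
    ring.add(new double[] {0, 10});
    return ring;
  }

  /** Minimum X over the given vertices. */
  static double minX(List<double[]> vertices) {
    double min = Double.POSITIVE_INFINITY;
    for (double[] v : vertices) {
      min = Math.min(min, v[0]);
    }
    return min;
  }

  /** Twice the signed area of triangle (a, b, c); zero means the points are collinear. */
  static double cross(double[] a, double[] b, double[] c) {
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0]);
  }

  /** Assumed cleanup step: drop every vertex that is collinear with its two ring neighbours. */
  static List<double[]> dropCollinear(List<double[]> ring) {
    List<double[]> kept = new ArrayList<>();
    int n = ring.size();
    for (int i = 0; i < n; i++) {
      double[] prev = ring.get((i - 1 + n) % n);
      double[] next = ring.get((i + 1) % n);
      if (cross(prev, ring.get(i), next) != 0) {
        kept.add(ring.get(i));
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    List<double[]> raw = spikedSquare();
    System.out.println("raw minX     = " + minX(raw));
    System.out.println("cleaned minX = " + minX(dropCollinear(raw)));
  }
}
```

In this sketch the two minimum X values differ, which is the same shape of disagreement as in the assertion failure above. With strict validation the spiked ring would be rejected up front, so the two values would never be compared.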

I will open a PR to switch the base test cases over to strict geometry validation instead of random validation.




> New companion doc value format for LatLonShape and XYShape field types
> ----------------------------------------------------------------------
>
>                 Key: LUCENE-10654
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10654
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Nick Knize
>            Priority: Major
>             Fix For: 9.4
>
>          Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> {{XYDocValuesField}} provides doc value support for {{XYPoint}}.
> {{LatLonDocValuesField}} provides doc value support for {{LatLonPoint}}.
> However, neither {{LatLonShape}} nor {{XYShape}} currently has a doc value format.
> This lack of doc value support for shapes means facets, aggregations, and IndexOrDocValues queries are currently not possible for Shape field types. This gap needs to be closed in Lucene.
> To support IndexOrDocValues queries along with various geometry aggregations and facets, the ability to compute the spatial relation with the doc value is needed. This is straightforward with {{XYPoint}} and {{LatLonPoint}} since the doc value encoding is nothing more than a simple 2D integer encoding of the x,y and lat,lon dimensional components. Accomplishing the same with a naive integer-encoded binary representation for N-vertex shapes would be costly.
> {{ComponentTree}} already provides an efficient in-memory structure for quickly computing spatial relations over Shape types based on a binary tree of tessellated triangles provided by the {{Tessellator}}. Furthermore, this tessellation is already computed at index time. If we create an on-disk representation of {{ComponentTree}}'s binary tree of tessellated triangles and use this as the doc value {{binaryValue}} format, we will be able to efficiently compute spatial relations with this binary representation and achieve the same facet/aggregation results over shapes as we can with points today (e.g., grid facets, centroid, area, etc.).
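
The "simple 2D integer encoding" mentioned for point doc values in the issue text can be sketched roughly as follows. This is a simplified illustration, not Lucene's implementation: the class {{PointDocValueSketch}} and its methods are hypothetical, and Lucene's own encoding utilities may differ in rounding and edge-case handling. Each coordinate is quantized to a 32-bit integer, and the two integers are packed into one 64-bit value.

```java
// Illustrative only: not Lucene code. All names here are hypothetical.
class PointDocValueSketch {

  private static final double CELLS = 4294967296.0; // 2^32 quantization cells per dimension

  /** Quantizes a value into an int; range is 180 for latitude and 360 for longitude. */
  static int quantize(double value, double range) {
    double q = Math.floor(value / (range / CELLS));
    // Clamp so the upper bound (e.g. lat = 90) does not overflow the int range.
    return (int) Math.max(Integer.MIN_VALUE, Math.min(Integer.MAX_VALUE, q));
  }

  /** Returns the lower edge of the quantization cell, so the error is below one cell width. */
  static double dequantize(int encoded, double range) {
    return encoded * (range / CELLS);
  }

  /** Packs latitude into the high 32 bits and longitude into the low 32 bits. */
  static long pack(int latEnc, int lonEnc) {
    return ((long) latEnc << 32) | (lonEnc & 0xFFFFFFFFL);
  }

  static int unpackLat(long packed) {
    return (int) (packed >>> 32);
  }

  static int unpackLon(long packed) {
    return (int) packed;
  }

  public static void main(String[] args) {
    long packed = pack(quantize(40.7128, 180.0), quantize(-74.0060, 360.0));
    System.out.println("lat ~ " + dequantize(unpackLat(packed), 180.0));
    System.out.println("lon ~ " + dequantize(unpackLon(packed), 360.0));
  }
}
```

A point fits in one fixed-width value this way, which is why computing spatial relations against point doc values is cheap. An N-vertex shape has no such fixed-width form, which is the motivation the issue gives for an on-disk triangle tree.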


