You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2022/09/13 10:28:20 UTC

[GitHub] [lucene] iverase opened a new issue, #11767: Does the method #cureLocalIntersections on the Tessellator makes any sense?

iverase opened a new issue, #11767:
URL: https://github.com/apache/lucene/issues/11767

   ### Description
   
   I have always been confused by this method which claims to fix self-intersections of the polygon but we require in the Tessellator java docs that polygons should not have self-intersections. More over, we have no test that needs this method to pass so it is currently untested / unnecessary. 
   
   I am bringing this up because I have an example of a polygon that it seems to spend half of the time on that method. It is a huge polygon that with this method takes around 560 seconds to tessellate but 260 seconds if we don't go through this method.
   
   My proposal is to remove the method completely or at least not call this method if the Tessellator has been called with the flag `checkSelfIntersections` set to true.
   
   @nknize introduced this method on the first version of the Tessellator, he might have more background about the need of this method. what do you think?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] llermaly commented on issue #11767: Does the method #cureLocalIntersections in the Tessellator make any sense?

Posted by GitBox <gi...@apache.org>.
llermaly commented on issue #11767:
URL: https://github.com/apache/lucene/issues/11767#issuecomment-1248415566

   Here I have some valid polygons being rejected for self intersecting, in case are useful for you to test: 
   
   https://github.com/apache/lucene/issues/11776


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] nknize commented on issue #11767: Does the method #cureLocalIntersections in the Tessellator make any sense?

Posted by GitBox <gi...@apache.org>.
nknize commented on issue #11767:
URL: https://github.com/apache/lucene/issues/11767#issuecomment-1248382179

   > My proposal is to remove the method completely or at least not call this method if the Tessellator has been called with the flag `checkSelfIntersections` set to true.
   > 
   > @nknize introduced this method on the first version of the Tessellator, he might have more background about the need of this method. what do you think?
   
   I was actually looking into this before turning my attention to the shape doc values and I was just getting ready to come back to it. The method was originally introduced to postpone self intersection removal unless absolutely necessary (e.g., tessellation failed). Essentially it was a lazy cleaning approach. I believe, though this needs thorough evaluation, some of the improvements made to [filter points](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/geo/Tessellator.java#L1326) rendered much of this logic obsolete. My last test (random and explicit) never actually exercised this logic, even with self intersections. Furthermore, the follow up SPLIT logic was not exercised either so I was exploring the possibility of removing both of these phases.  
   
   > if you could go to https://github.com/apache/lucene/issues/11777 and test with those polygons as well. We are having Elastic Cloud timeouts because of the time it takes to triangulate.
   
   These adversarial cases are important to capture in our tests. Our randomized polygon generator doesn't inject any self intersections so we really have a gap in testing the logic coverage. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] llermaly commented on issue #11767: Does the method #cureLocalIntersections in the Tessellator make any sense?

Posted by GitBox <gi...@apache.org>.
llermaly commented on issue #11767:
URL: https://github.com/apache/lucene/issues/11767#issuecomment-1248367379

   Hi @iverase would be nice if you could go to https://github.com/apache/lucene/issues/11777 and test with those polygons as well. We are having Elastic Cloud timeouts because of the time it takes to triangulate.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] iverase commented on issue #11767: Does the method #cureLocalIntersections in the Tessellator make any sense?

Posted by GitBox <gi...@apache.org>.
iverase commented on issue #11767:
URL: https://github.com/apache/lucene/issues/11767#issuecomment-1248975086

   >The method was originally introduced to postpone self intersection removal 
   
   I don't understand this. We re claiming in the java docs that polygons should not be self-intersecting and we do not introduce self-intersections in our code, why we want to remove them?
   
   ```
    * <ul>
    *   <li>Requires valid polygons:
    *       <ul>
    *         <li>No self intersections
    *         <li>Holes may only touch at one vertex
    *         <li>Polygon must have an area (e.g., no "line" boxes)
    *         <li>sensitive to overflow (e.g, subatomic values such as E-200 can cause unexpected
    *             behavior)
    *       </ul>
    * </ul>
   ```
   Looking at the original code which the tessellator is inspired on, the method was introduced to handle some OSM polygons that contain self-intersections, hence not valid: https://github.com/mapbox/earcut/issues/8
   As we claim we only support valid polygons, I think it is safe to remove the method entirely.
   
   @llermaly I found this issue by looking into one of your polygons so we should expect nice performance improvements.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] iverase commented on issue #11767: Does the method #cureLocalIntersections in the Tessellator make any sense?

Posted by GitBox <gi...@apache.org>.
iverase commented on issue #11767:
URL: https://github.com/apache/lucene/issues/11767#issuecomment-1248999452

   Then let's just skip that step if the tessellator is called with the flag `checkSelfIntersections` set to true. In that case it will fail before getting to this place.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] nknize commented on issue #11767: Does the method #cureLocalIntersections in the Tessellator make any sense?

Posted by GitBox <gi...@apache.org>.
nknize commented on issue #11767:
URL: https://github.com/apache/lucene/issues/11767#issuecomment-1248986666

   > I don't understand this. We re claiming in the java docs that polygons should not be self-intersecting and we do not introduce self-intersections in our code, why we want to remove them?
   
   
   Because real life Geo data doesn't care what our javadocs say. Small self intersections are a reality that rears its head every now and then in real data and the performance hit to "best effort" detect and clean in the tessellator's cure phase at index time was worth more than directing user's to a third party cleaning utility before indexing. 
   
   Our blind polygon class does nothing to enforce those javadocs so maybe before completely removing we might consider flagging that phase as an optional validation step disabled by default? (Think `ignore_malformed`) 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org