Posted to dev@lucene.apache.org by "David Smiley (JIRA)" <ji...@apache.org> on 2013/06/14 01:36:20 UTC

[jira] [Updated] (LUCENE-5056) Indexing non-point shapes close to the poles doesn't scale

     [ https://issues.apache.org/jira/browse/LUCENE-5056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Smiley updated LUCENE-5056:
---------------------------------

    Attachment: indexed circle close to the pole.png

The attached picture shows a circle of the same size (~5km radius) at 88 degrees latitude.  It generated a whopping 7888 cells, and it gets worse closer to the pole.  Technically this approach works, but it clearly doesn't scale at the poles.
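To give a rough feel for why the cell count grows, here is a minimal Python sketch of how wide a fixed-radius circle becomes in degrees of longitude as its center moves toward the pole. It assumes a spherical Earth with a mean radius of about 6371 km; the function name and constant are illustrative and not part of Lucene or Spatial4j. It does not reproduce the 7888-cell figure, it only shows the geometric trend behind it.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius (spherical model)


def lon_span_deg(radius_km, center_lat_deg):
    """Full longitude width, in degrees, of a circle on a sphere.

    Hypothetical helper for illustration only.
    """
    d = radius_km / EARTH_RADIUS_KM          # angular radius in radians
    lat = math.radians(center_lat_deg)
    # A circle that touches or encloses a pole spans every longitude.
    if abs(lat) + d >= math.pi / 2:
        return 360.0
    # Standard spherical bounding-box half-width for a circle.
    half_width = math.asin(math.sin(d) / math.cos(lat))
    return 2 * math.degrees(half_width)


if __name__ == "__main__":
    for lat in (0, 45, 80, 88, 89.9):
        print(lat, lon_span_deg(5.0, lat))
```

The latitude extent of the circle stays the same at every latitude, so on a grid whose cells are fixed in degrees, the extra longitude width translates into more cells along the shape's edge.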

I'm gonna have to think about this one for a bit.  I think a fix requires a new SpatialPrefixTree encoding that divides the world differently at the poles.  Solving this is arguably a new requirement for LUCENE-4922.
                
> Indexing non-point shapes close to the poles doesn't scale
> ----------------------------------------------------------
>
>                 Key: LUCENE-5056
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5056
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/spatial
>    Affects Versions: 4.3
>            Reporter: David Smiley
>         Attachments: indexed circle close to the pole.png
>
>
> From: [~hdeadman]
> We are seeing an issue where certain shapes are causing Solr to use up all available heap space when a record with one of those shapes is indexed. We were indexing polygons where we had the points going clockwise instead of counter-clockwise and the shape would be so large that we would run out of memory. We fixed those shapes but we are seeing this circle eat up about 700MB of memory before we get an OutOfMemory error (heap space) with a 1GB JVM heap.
> Circle(3.0 90 d=0.0499542757922153)
> Google Earth can't plot that circle either. Maybe it is invalid or too close to the north pole because of the latitude of 90, but it would be nice if there were a way for shapes to be validated before they cause an OOM error.
> The objects (4.5 million) are all GeohashPrefixTree$GhCell objects in an ArrayList owned by PrefixTreeStrategy$CellTokenStream.
> Is there any way to have a max number of cells in a shape before it is considered too large and is not indexed? Is there a geo library that could validate the shape as being reasonably sized and bounded before it is processed?
> We are currently using Solr 4.1.
> <fieldType name="location_rpt" class="solr.SpatialRecursivePrefixTreeFieldType"
> spatialContextFactory="com.spatial4j.core.context.jts.JtsSpatialContextFactory"
> geo="true" distErrPct="0.025" maxDistErr="0.000009" units="degrees" />
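As a rough illustration of the kind of pre-indexing check asked about above, the following Python sketch computes a crude upper bound on the number of fixed-size grid cells covering a circle's latitude/longitude bounding box, and rejects the shape when that bound exceeds a limit. Everything here is an assumption for illustration: the function names, the `max_cells` limit, and the cell size of 0.00125 degrees (picked as 0.025 times a radius of about 0.05 degrees) are hypothetical and are not Solr, Lucene, or Spatial4j APIs. A prefix tree covers a shape's interior with coarser cells, so the real count would be lower than this bound.

```python
import math


def circle_bbox_deg(center_lat_deg, radius_deg):
    """Return (lon_span, lat_span) in degrees for a circle on a sphere."""
    lat_min = max(-90.0, center_lat_deg - radius_deg)
    lat_max = min(90.0, center_lat_deg + radius_deg)
    if abs(center_lat_deg) + radius_deg >= 90.0:
        # The circle touches or encloses a pole: it spans all longitudes.
        lon_span = 360.0
    else:
        r = math.radians(radius_deg)
        lat = math.radians(center_lat_deg)
        lon_span = 2 * math.degrees(math.asin(math.sin(r) / math.cos(lat)))
    return lon_span, lat_max - lat_min


def cell_upper_bound(center_lat_deg, radius_deg, cell_deg):
    """Crude upper bound: cells of side cell_deg covering the bounding box."""
    lon_span, lat_span = circle_bbox_deg(center_lat_deg, radius_deg)
    return math.ceil(lon_span / cell_deg) * math.ceil(lat_span / cell_deg)


def check_circle(center_lat_deg, radius_deg, cell_deg, max_cells):
    """Raise ValueError if the bound exceeds max_cells (hypothetical guard)."""
    n = cell_upper_bound(center_lat_deg, radius_deg, cell_deg)
    if n > max_cells:
        raise ValueError(f"shape too large to index: up to {n} cells")
    return n


if __name__ == "__main__":
    radius = 0.0499542757922153  # the d value from the circle quoted above
    print(cell_upper_bound(0.0, radius, 0.00125))   # same circle at the equator
    print(cell_upper_bound(90.0, radius, 0.00125))  # the circle at latitude 90
```

Under these assumptions, the bound for the circle at latitude 90 is several orders of magnitude larger than for the same circle at the equator, which is the kind of signal a validation step could use to refuse the shape before it exhausts the heap.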
