Posted to dev@lucene.apache.org by "David Smiley (JIRA)" <ji...@apache.org> on 2014/02/17 04:48:20 UTC

[jira] [Resolved] (LUCENE-5408) SerializedDVStrategy -- match geometries in DocValues

     [ https://issues.apache.org/jira/browse/LUCENE-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Smiley resolved LUCENE-5408.
----------------------------------

       Resolution: Fixed
    Fix Version/s: 5.0

> SerializedDVStrategy -- match geometries in DocValues
> -----------------------------------------------------
>
>                 Key: LUCENE-5408
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5408
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/spatial
>            Reporter: David Smiley
>            Assignee: David Smiley
>             Fix For: 5.0, 4.7
>
>         Attachments: LUCENE-5408_GeometryStrategy.patch, LUCENE-5408_SerializedDVStrategy.patch
>
>
> I've started work on a new SpatialStrategy implementation I'm tentatively calling SerializedDVStrategy.  It's similar to the [JtsGeoStrategy in Spatial-Solr-Sandbox|https://github.com/ryantxu/spatial-solr-sandbox/tree/master/LSE/src/main/java/org/apache/lucene/spatial/pending/jts] but a little different in the details -- certainly faster.  Using Spatial4j 0.4's BinaryCodec, it'll serialize the shape to bytes (for polygons this is internally WKB format) and the strategy will put it in a BinaryDocValuesField.  In practice the shape is likely a polygon but it needn't be.  Then I'll implement a Filter that returns a DocIdSetIterator that evaluates a given document, passed via advance(docid), to see if the query shape matches a shape in DocValues.  It's improper usage for it to be used in a situation where it will evaluate every document id via nextDoc().  And in practice the DocValues format chosen should be a disk-resident one, since each serialized value tends to be fairly large.
> This spatial strategy in and of itself has no _index_; it's O(N) where N is the number of documents that get passed through it.  So it should be placed last in the query/filter tree so that the other queries limit the documents it needs to see.  At a minimum, it should be paired with another query/filter, such as another SpatialStrategy like RecursivePrefixTreeStrategy.
> Eventually, once the PrefixTree grid encoding has a little bit more metadata, it will be possible to further combine the grid & this strategy in such a way that many documents won't need to be checked against the serialized geometry.
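
The approach described above can be illustrated with a minimal, dependency-free Java sketch. It is not the patch attached to this issue and uses no Lucene or Spatial4j classes. Everything in it is a hypothetical stand-in: Rect replaces an arbitrary shape, serialize/deserialize replace Spatial4j's BinaryCodec, the in-memory map replaces a BinaryDocValuesField, and verify() replaces the per-document check a Filter would perform on advance(docid). The point it tries to show is that the serialized shape is decoded only for documents that a cheaper, index-backed filter has already let through.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names are Lucene or Spatial4j API.
final class SerializedShapeSketch {

    /** Axis-aligned rectangle, standing in for an arbitrary (likely polygonal) shape. */
    static final class Rect {
        final double minX, minY, maxX, maxY;

        Rect(double minX, double minY, double maxX, double maxY) {
            this.minX = minX;
            this.minY = minY;
            this.maxX = maxX;
            this.maxY = maxY;
        }

        boolean intersects(Rect o) {
            return minX <= o.maxX && o.minX <= maxX
                && minY <= o.maxY && o.minY <= maxY;
        }
    }

    /** Stand-in for a binary shape codec: four doubles, 32 bytes. */
    static byte[] serialize(Rect r) {
        return ByteBuffer.allocate(4 * Double.BYTES)
            .putDouble(r.minX).putDouble(r.minY)
            .putDouble(r.maxX).putDouble(r.maxY)
            .array();
    }

    static Rect deserialize(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        return new Rect(buf.getDouble(), buf.getDouble(), buf.getDouble(), buf.getDouble());
    }

    /** Stand-in for a per-document binary DocValues field. */
    private final Map<Integer, byte[]> shapeBytesByDoc = new HashMap<>();

    /** Counts how many serialized shapes were decoded, to make the O(N) cost visible. */
    int decodeCount = 0;

    void addDocument(int docId, Rect shape) {
        shapeBytesByDoc.put(docId, serialize(shape));
    }

    /**
     * Checks only the candidate documents handed in by a cheaper filter.
     * There is no index here: the cost is linear in the number of candidates.
     */
    List<Integer> verify(int[] candidateDocIds, Rect queryShape) {
        List<Integer> hits = new ArrayList<>();
        for (int docId : candidateDocIds) {
            byte[] bytes = shapeBytesByDoc.get(docId);
            if (bytes == null) {
                continue; // document has no shape value
            }
            decodeCount++;
            if (deserialize(bytes).intersects(queryShape)) {
                hits.add(docId);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        SerializedShapeSketch sketch = new SerializedShapeSketch();
        sketch.addDocument(0, new Rect(0, 0, 1, 1));
        sketch.addDocument(1, new Rect(5, 5, 6, 6));
        sketch.addDocument(2, new Rect(0.5, 0.5, 2, 2));
        sketch.addDocument(3, new Rect(0, 0, 1, 1));

        // Pretend a cheaper filter already narrowed the candidates to docs 0, 1 and 2.
        int[] candidates = {0, 1, 2};
        List<Integer> hits = sketch.verify(candidates, new Rect(0.9, 0.9, 1.5, 1.5));
        System.out.println("hits=" + hits + " decoded=" + sketch.decodeCount);
    }
}
```

In this sketch document 3 would match the query shape but is never decoded, because the upstream filter did not pass it along. That is the trade-off described above: the verification step is exact but unindexed, so whatever runs before it decides both how much work it does and which documents it can ever return.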



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
