Posted to dev@lucene.apache.org by "David Smiley (JIRA)" <ji...@apache.org> on 2014/07/09 04:35:05 UTC

[jira] [Resolved] (LUCENE-5779) Improve BBox AreaSimilarity algorithm to consider lines and points

     [ https://issues.apache.org/jira/browse/LUCENE-5779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Smiley resolved LUCENE-5779.
----------------------------------

       Resolution: Fixed
    Fix Version/s: 4.10
                   5.0
         Assignee: David Smiley

> Improve BBox AreaSimilarity algorithm to consider lines and points
> ------------------------------------------------------------------
>
>                 Key: LUCENE-5779
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5779
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/spatial
>            Reporter: David Smiley
>            Assignee: David Smiley
>             Fix For: 5.0, 4.10
>
>         Attachments: LUCENE-5779__Improved_bbox_AreaSimilarity_algorithm.patch
>
>
> GeoPortal's area overlap algorithm didn't consider lines and points; they end up driving the score to 0.  I've thought about this for a bit and come up with an alternative scoring algorithm (already coded, tested, and documented).
> New Javadocs:
> {code:java}
> /**
>  * The algorithm is implemented as envelope on envelope overlays rather than
>  * complex polygon on complex polygon overlays.
>  * <p/>
>  * <p/>
>  * Spatial relevance scoring algorithm:
>  * <DL>
>  *   <DT>queryArea</DT> <DD>the area of the input query envelope</DD>
>  *   <DT>targetArea</DT> <DD>the area of the target envelope (per Lucene document)</DD>
>  *   <DT>intersectionArea</DT> <DD>the area of the intersection between the query and target envelopes</DD>
>  *   <DT>queryTargetProportion</DT> <DD>A 0-1 factor that splits the score between the query ratio and the target ratio.
>  *   0.5 weights them evenly.</DD>
>  *
>  *   <DT>queryRatio</DT> <DD>intersectionArea / queryArea; (see note)</DD>
>  *   <DT>targetRatio</DT> <DD>intersectionArea / targetArea; (see note)</DD>
>  *   <DT>queryFactor</DT> <DD>queryRatio * queryTargetProportion;</DD>
>  *   <DT>targetFactor</DT> <DD>targetRatio * (1 - queryTargetProportion);</DD>
>  *   <DT>score</DT> <DD>queryFactor + targetFactor;</DD>
>  * </DL>
>  * Note: The actual computation of queryRatio and targetRatio is more complicated so that it considers
>  * points and lines. Lines have the ratio of overlap, and points are either 1.0 or 0.0 depending on whether
>  * they intersect or not.
>  * <p />
>  * Based on Geoportal's
>  * <a href="http://geoportal.svn.sourceforge.net/svnroot/geoportal/Geoportal/trunk/src/com/esri/gpt/catalog/lucene/SpatialRankingValueSource.java">
>  *   SpatialRankingValueSource</a> but modified. GeoPortal's algorithm will yield a score of 0
>  * if either a line or point is compared, and it doesn't output a 0-1 normalized score (it multiplies the factors).
>  *
>  * @lucene.experimental
>  */
> {code}
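
The scoring described in the Javadoc above can be sketched roughly as follows. This is an illustration only, not the code from the attached patch: the class BBoxOverlapSketch, its nested Envelope type, and the ratio() and score() helpers are hypothetical names, envelopes are assumed to be plain min/max rectangles that never cross the dateline, and the fallback order for degenerate shapes (area, then height, then width, then point) is one plausible reading of the note about lines and points.

```java
/** Illustrative sketch only; all names here are hypothetical, not Lucene API. */
final class BBoxOverlapSketch {

  /** Axis-aligned envelope. Zero width or zero height makes it a line; both zero, a point. */
  static final class Envelope {
    final double minX, maxX, minY, maxY;

    Envelope(double minX, double maxX, double minY, double maxY) {
      this.minX = minX;
      this.maxX = maxX;
      this.minY = minY;
      this.maxY = maxY;
    }

    double width()  { return maxX - minX; }
    double height() { return maxY - minY; }
    double area()   { return width() * height(); }
  }

  /**
   * Share of the shape covered by the intersection. Assumes the caller has
   * already established that the two envelopes intersect.
   */
  static double ratio(Envelope shape, double ixWidth, double ixHeight) {
    if (shape.area() > 0) {
      return (ixWidth * ixHeight) / shape.area(); // ordinary rectangle
    }
    if (shape.height() > 0) {
      return ixHeight / shape.height();           // vertical line
    }
    if (shape.width() > 0) {
      return ixWidth / shape.width();             // horizontal line
    }
    return 1.0;                                   // point that intersects
  }

  /** score = queryRatio * proportion + targetRatio * (1 - proportion), in the range 0-1. */
  static double score(Envelope query, Envelope target, double queryTargetProportion) {
    double ixWidth = Math.min(query.maxX, target.maxX) - Math.max(query.minX, target.minX);
    double ixHeight = Math.min(query.maxY, target.maxY) - Math.max(query.minY, target.minY);
    if (ixWidth < 0 || ixHeight < 0) {
      return 0.0; // disjoint envelopes (covers the non-intersecting point case)
    }
    double queryFactor = ratio(query, ixWidth, ixHeight) * queryTargetProportion;
    double targetFactor = ratio(target, ixWidth, ixHeight) * (1 - queryTargetProportion);
    return queryFactor + targetFactor;
  }

  public static void main(String[] args) {
    Envelope query = new Envelope(0, 10, 0, 10);
    Envelope point = new Envelope(5, 5, 5, 5);
    // The point lies inside the query box: targetRatio is 1.0, queryRatio is 0.0.
    System.out.println(score(query, point, 0.5)); // prints 0.5
  }
}
```

Because the two factors are summed rather than multiplied, a point or line in the index still gets a non-zero score whenever it intersects the query, and the result stays within 0-1.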



--
This message was sent by Atlassian JIRA
(v6.2#6252)
