Posted to solr-user@lucene.apache.org by Bojan Šmid <bo...@gmail.com> on 2014/02/03 14:00:26 UTC

Re: Geospatial clustering + zoom in/out help

Hi David,

  I was hoping to get an answer on the geospatial topic from you :). These
links basically confirm that the approach I wanted to take should work OK
with an amount of data similar to (or even bigger than) what I plan to have.
Instead of my custom NxM division of the world, I'll try the existing GeoHash
encoding; it may be good enough (and will be quicker to implement).
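
  For anyone following along, here is a minimal sketch of what GeoHash-based
zoom fields could look like. It is plain Python rather than anything
Solr-specific, and it assumes one geohash character per zoom level. The
helper name zoom_fields and the zoom1..zoom11 field names are hypothetical,
carried over from the original proposal.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash_encode(lat, lon, precision=11):
    """Encode a lat/lon pair as a geohash string of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


def zoom_fields(lat, lon, levels=11):
    """Hypothetical helper: one geohash prefix per zoom level."""
    full = geohash_encode(lat, lon, levels)
    return {f"zoom{i}": full[:i] for i in range(1, levels + 1)}
```

  Each deeper prefix names a sub-cell of the shallower one, so zooming in on
a cluster amounts to filtering on its prefix at the next level. Whether one
character per level (a 32-way split) is too coarse a step is something I
would still need to check.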

  Thanks!

  Bojan


On Fri, Jan 31, 2014 at 8:27 PM, Smiley, David W. <ds...@mitre.org> wrote:

> Hi Bojan.
>
> You've got some good ideas here along the lines of some that others have
> tried.  I threw together a page on the wiki about this subject some
> time ago that I'm sure you will find interesting.  It references a relevant
> Stack Overflow post, and also a presentation at DrupalCon which had a
> segment from a guy using the same approach you suggest here involving
> field-collapsing and/or stats components.  The video shows it in action.
>
> http://wiki.apache.org/solr/SpatialClustering
>
> It would be helpful for everyone if you share your experience with
> whatever you choose, once you give an approach a try.
>
> ~ David
> ________________________________________
> From: Bojan Šmid [bosmid@gmail.com]
> Sent: Thursday, January 30, 2014 1:15 PM
> To: solr-user@lucene.apache.org
> Subject: Geospatial clustering + zoom in/out help
>
> Hi,
>
> I have an index with 300K docs with lat,lon. I need to cluster the docs
> based on lat,lon for display in the UI. The user then needs to be able to
> click on any cluster and zoom in (up to 11 levels deep).
>
> I'm using Solr 4.6 and I'm wondering how best to implement this
> efficiently?
>
> Some more specific questions are below.
>
> I need to:
>
> 1) cluster data points at different zoom levels
>
> 2) click on a specific cluster and zoom in
>
> 3) be able to select a region (bounding box or polygon) and show clusters
> in the selected area
>
> What's the best way to implement this so that queries are fast?
>
> What I thought I would try, but maybe there are better ways:
>
> * divide the world into NxM large squares and then each of these squares
> into 4 more squares, and so on - 11 levels deep
>
> * at index time figure out all squares (at all 11 levels) each data point
> belongs to and index that info into 11 different fields: e.g.
> <id=1 name=foo lat=x lon=y zoom1=square1_62  zoom2=square1_62_47
> zoom3=square1_62_47_33 ....>
>
> * at search time, use field collapsing on the zoomX field to get which docs
> belong to which square on a particular level
>
> * calculate center point of each square (by calculating mean value of
> positions for all points in that square) using StatsComponent (facet on
> zoomX field, avg on lat and lon fields) - I would consider those squares as
> separate clusters (one square is one cluster) and center points of those
> squares as center points of clusters derived from them
>
> I *think* the problem with this approach is that:
>
> * there will be many unique field values at the deeper zoom levels, which
> means field collapsing / StatsComponent maaay not work fast enough
>
> * clusters will not look very natural because I would have many clusters on
> each zoom level and what are "real" geographical clusters would be
> displayed as multiple clusters since their points would in some cases be
> dispersed into multiple squares. But that may be OK
>
> * a lot will depend on how the squares are calculated - linearly dividing
> 360 degrees by N to get "equal" size squares in degrees would produce
> issues with "real" square sizes and counts of points in each of them
>
>
> So I'm wondering if there is a better way?
>
> Thanks,
>
>
>   Bojan
>
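
The grid-and-centroid idea in the quoted message can be sketched in plain
Python as below. This is only an illustration under stated assumptions: the
function names cell_key and cluster, the key format, and the 8x4 top-level
grid are all hypothetical, and the in-memory averaging merely stands in for
what the message proposes to do with field collapsing and StatsComponent in
Solr.

```python
from collections import defaultdict


def cell_key(lat, lon, level, n=8, m=4):
    """Key of the grid cell containing (lat, lon) at a zoom level.

    Level 1 splits the world into n columns by m rows (assumed values);
    each deeper level splits every cell into 2 x 2 sub-cells.
    """
    cols = n * 2 ** (level - 1)
    rows = m * 2 ** (level - 1)
    # Clamp so that lon == 180 or lat == 90 falls into the last cell.
    col = min(int((lon + 180.0) / 360.0 * cols), cols - 1)
    row = min(int((lat + 90.0) / 180.0 * rows), rows - 1)
    return f"{level}_{col}_{row}"


def cluster(points, level):
    """Group (lat, lon) points by cell; return mean lat, mean lon, count."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for lat, lon in points:
        acc = sums[cell_key(lat, lon, level)]
        acc[0] += lat
        acc[1] += lon
        acc[2] += 1
    return {k: (s[0] / s[2], s[1] / s[2], s[2]) for k, s in sums.items()}
```

Because the cells are equal in degrees, their real width shrinks toward the
poles, which is the sizing concern raised in the last bullet of the message.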