You are viewing a plain text version of this content. The canonical link for it is here.
Posted to solr-user@lucene.apache.org by SpacemanSteve <Sp...@gmail.com> on 2011/06/24 15:14:27 UTC

intersecting map extent with solr spatial documents

The following describes a Solr query filter that determines if axis aligned
geographic bounding boxes intersect.  It is used to determine which
documents in a Solr repository containing spatial data are relevant to a
given rectangular geographic region that corresponds to a displayed map.  I
haven’t seen this described before.  I thought it might be useful to others
and I might get some pointers on how to improve it.

OpenGeoPortal (http://geoportal-demo.atech.tufts.edu/) is a web application
supporting the rapid discovery of GIS layers.  It uses Solr to combine
spatial, keyword, date and GIS datatype based searching.  As a user
manipulates its map OpenGeoPortal automatically computes and displays
relevant search results.  This spatial searching requires the application to
determine which GIS layers are relevant given the current extent of the map. 
Each Solr document includes spatial information about a single GIS layer. 
Specifically, it contains the center of the layer (in degrees latitude and
longitude stored as tdoubles) as well as the height and width of the layer
(in degrees stored as a tdouble).  These values are precomputed from the
bounding boxes of the layers during ingest.

To identify relevant layers our search algorithm looks for a separating axis
(http://en.wikipedia.org/wiki/Separating_axis_theorem) between the current
bounds of the map and the bounds of each layer.  If a horizontal or vertical
separating axis exists then the layer does not contain any information in
the geographic area defined by the map.  If neither separating axis exists
then the layer intersects the map, and is included in the result set.  

Identifying whether separating axes exists is relatively straightforward
given two axis-aligned bounding boxes.  In our case, one bounding box is
defined by the map’s current extent and the other bounding box by a GIS
layer.  To determine if a vertical separating exists one must determine if
the difference between the center longitude of the map and the center
longitude of the layer is greater then the sum of the width of the map with
the width of the layer.  If so, a vertical separating axis exists.  If not,
a vertical separating axis does not exist.  (See 
http://www.gamasutra.com/view/feature/3383/simple_intersection_tests_for_games.php?page=3
for a diagram.) Similarly, the presence of a horizontal separating can be
computed using center latitudes and heights.  

It is possible to generate a Solr filter query that filters layers that
contain neither a horizontal or vertical separating axis given a specific
map.  Naturally, this query is somewhat complicated.  The query essentially
counts the number of separating axes and, using !frange, eliminates layers
that have a separating axis.  In the following example, the map was centered
on latitude 42.3, longitude -71.0 and had a width and height of 0.3 degrees. 
The schema defines the fields CenterX, CenterY, HalfWidth and HalfHeight.

fq={!frange l=1  u=2}
map(sum(
map(sub(abs(sub(-71.0,CenterX)),sum(0.3,HalfWidth)),0,360,1,0),
map(sub(abs(sub(42.3,CenterY)),sum(0.3,HalfHeight)),0,90,1,0)),
0,0,1,0)

The clauses that check for bounding box (e.g.,
sub(abs(sub(-71.0,CenterX)),sum(0.3,HalfWidth)) and
sub(abs(sub(42.3,CenterY)),sum(0.3,HalfHeight))) return a positive number if
a separating axis exists.  Using a map function, this value is mapped to 1
if the separating axis exists and 0 if it does not.  The Solr query checks
for two separating axes and computes the number of such axes using sum.  The
total number of separating axes (which is 0, 1 or 2) is then mapped to the
values 0 and 1.  This final map returns 1 if there are no separating axes
(that is, the bounding boxes intersect) or 0 if there is at least one
separating axis (that is, the bounding boxes do not intersect).  The
outermost clause applies a frange function to eliminate those layers that do
not intersect the current map.  

Ranking the layers that intersect the map is a separate issue.  This is done
with several query clauses.  One clause determines how the area of the map
compares area of the layer.  The other determines how the center of the map
compares to the center of the layer.  These clauses are used in conjunction
keyword-based queries and date-based filters to create search results based
on spatial, keyword and temporal constraints.

--
View this message in context: http://lucene.472066.n3.nabble.com/intersecting-map-extent-with-solr-spatial-documents-tp3104098p3104098.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: intersecting map extent with solr spatial documents

Posted by "David Smiley (@MITRE.org)" <DS...@mitre.org>.
Steve,
 I like your geospatial boosting algorithm; it makes sense.

FYI I'm collaborating with a couple other Lucene/Solr committers on a new
geospatial module here:  http://code.google.com/p/lucene-spatial-playground/ 
It is very much in-progress.  There is variable-points per document support
(based on geohashes or other prefix grids) and I'm nearly done with polygon
indexing -- just have some testing to do which will surely find bugs at this
early stage. At some point I want to incorporate your box shape
function-query based technique into what we call a "SpatialStrategy" and
benchmark it.  I have other priorities first.

~ David Smiley

-----
 Author: https://www.packtpub.com/solr-1-4-enterprise-search-server/book
--
View this message in context: http://lucene.472066.n3.nabble.com/intersecting-map-extent-with-solr-spatial-documents-tp3104098p3114922.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: intersecting map extent with solr spatial documents

Posted by SpacemanSteve <Sp...@gmail.com>.
    David, thanks for the kind words.  Your Solr book is terrific.  Without
it, my project would not have succeeded.  I am in your debt.

    The above spatial query filter eliminates GIS layers that do not
intersect a given map.  I attempt to rank the intersecting layers according
to how similar they are to the map.  The concept is the user can manipulate
the map to find GIS layers for a region.  If the user zooms in to a city,
they most likely want some local data for the city.  If the user zooms out
so that map displays an entire continent, the user probably isn’t very
interested in city street data.  Zooming the map out to display a continent
tells the application to increase the score for continent scale data.

    The spatial query has four components.  Three of the components simply
compare the area, center latitude and center longitude of the map to the
layer.  The closer properties of the layer are to the properties of the map,
the higher their score.  I use a recip function to compute these three
scores.  Scoring based on area is weighted the most with a boost of 15.  The
boost for matching center latitude and center longitude is 3 for each.  

The fourth spatial query component boosts layers that are completely
contained within the map extents.  It counts how many of the four corners of
the layer are contained within the map.  If the number of contained corners
equals 4, the layer is given a boost of 10.  Like the filter query, the
“contained within” query computes a value and then maps it to between 0 and
1 to essentially implement an “if” statement.

The full query includes these four spatial components and keyword based
clauses.  The scoring is set up so that the spatial components dominate. 
Currently, the keyword based query terms have a boost of 2 or 3.  Additional
filter queries can limit search results based on the year the layer was
created or its data type (raster, vector, etc.).  

If anybody wants to see the code, there’s about 800 lines of JavaScript at
http://code.google.com/p/opengeoportal/source/browse/WebContent/javascript/solr.js. 
It provides a JavaScript object you can instantiate, call set functions like
setBoundingBox, setDates, and then call sendToSolr to execute a search via
JSONP.

Rather then a single bounding box per GIS layer I would like to support
multiple shapes.  The bounding box works pretty well, there aren’t a lot of
diagonal states or countries.  However, if I could break down a single
bounding box into something more detailed, search would be better.  I
haven’t figured out how to do that yet!  The meta data associated with each
layer doesn’t provide anything more detailed then an axis aligned bounding
box.  

I’ll look into FieldCaches, that sounds like something I should know about. 
In my application I can run relatively expensive queries because I don’t
have all that much data to search.  Currently I have around 9k GIS data
layers.  Maybe that will reach 100k, but probably not even that.  



--
View this message in context: http://lucene.472066.n3.nabble.com/intersecting-map-extent-with-solr-spatial-documents-tp3104098p3108219.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: intersecting map extent with solr spatial documents

Posted by "David Smiley (@MITRE.org)" <DS...@mitre.org>.
Very cool!

What you've essentially described is a way of indexing & searching lat-lon
box shapes, and the cool thing is that you were able to do this without
custom coding / hacking of Solr.  Sweet!  I do have some observations about
this approach:
1. Doesn't support variable number of shapes per document. (LatLonType
doesn't either, by the way)
2. The use of function queries on CenterX, CenterY, HalfWidth, and
HalfHeight means that all these values (just the distinct ones) will be put
into RAM in Lucene's FieldCache.  Not a big deal but something to be noted.
3. The function query is going to be evaluated on every document matching
the keyword search.  That will probably perform okay; not so sure for large
indexes with a *:* query.

Again, nice job.  Could you please share an example of your ranking query?

~ David Smiley

-----
 Author: https://www.packtpub.com/solr-1-4-enterprise-search-server/book
--
View this message in context: http://lucene.472066.n3.nabble.com/intersecting-map-extent-with-solr-spatial-documents-tp3104098p3106333.html
Sent from the Solr - User mailing list archive at Nabble.com.