Posted to solr-user@lucene.apache.org by Javi Molina <ja...@gmail.com> on 2012/12/10 07:09:10 UTC

Intersect Circle is matching points way outside the radius ( Solr 4 Spatial)

Hi all,

I have been doing some development using Solr 4 Spatial recently and I
still can't grasp how to set up the spatial field/query properly to avoid
false matches.

I have read http://wiki.apache.org/solr/SolrAdaptersForLuceneSpatial4 several
times, and from there I understand that distErrPct is the main setting
that should control how accurate the match is.

The search scenario is as follows:

I am trying to match points (or bounding boxes) that intersect with a 50km
radius circle.

This is a particular example: Circle(149.39999999999998 -34.92
d=0.44964028776978415). Unfortunately it matches a record whose point is
more than 10 km outside the radius; that record has location
149.1244 -35.308056.


Visually, this is the circle in red, and the yellow point at the bottom is
the point that I am not expecting to be found (I hope you can see the
attached image).

Do you know what I need to do to tell Solr to use a more accurate
resolution for the circle, to avoid false matches? I think an error of 20% of
the radius of the circle is not very helpful.

This is the relevant configuration.

schema.xml
I have changed distErrPct to 0.01; changing it to 0 gives me an
OutOfMemoryError when indexing.


    <fieldType name="location_rpt"
        class="solr.SpatialRecursivePrefixTreeFieldType"
        geo="true" distErrPct="0.01" maxDistErr="0.000009" units="degrees" />


and the field location is just defined as

<field name="location" type="location_rpt" indexed="true" stored="true"
       multiValued="true" />


The solr query is this

http://[mysolrinstance]/solr/[myCore]/select?q=*%3A*&fq=location%3A"Intersects(Circle(149.39999999999998+-34.92+d%3D0.44964028776978415))+distErrPct%3D0.0"&wt=xml&indent=true&debugQuery=true

As you can see, I have even specified a value of 0.0 for distErrPct here,
but it seems to make no difference.
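
For anyone reproducing this, a filter like the one above could be assembled
programmatically. The following is only a minimal sketch in Python: the host
and core name are placeholders, the helper names are made up for
illustration, and the 111.2 km-per-degree conversion is an assumed
approximation rather than a value taken from Solr.

```python
from urllib.parse import urlencode

# Assumed conversion factor (rough approximation, for illustration only).
KM_PER_DEGREE = 111.2


def circle_filter(lon, lat, radius_km, dist_err_pct=0.0, field="location"):
    """Build an Intersects(Circle(...)) filter in the syntax used above."""
    radius_deg = radius_km / KM_PER_DEGREE
    return (
        f'{field}:"Intersects(Circle({lon} {lat} d={radius_deg}))'
        f' distErrPct={dist_err_pct}"'
    )


def select_url(base_url, fq):
    """Return a /select URL with every parameter percent-encoded."""
    params = {
        "q": "*:*",
        "fq": fq,
        "wt": "xml",
        "indent": "true",
        "debugQuery": "true",
    }
    return f"{base_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical host and core name.
    fq = circle_filter(149.4, -34.92, 50)
    print(select_url("http://localhost:8983/solr/mycore", fq))
```

Letting urlencode handle the escaping avoids hand-encoding the quotes,
parentheses and equals signs inside the fq parameter.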

The resulting XML is as follows; there you can find the record with point
149.1244 -35.308056 that I referred to previously.



<response><lst name="responseHeader"><int name="status">0</int><int
name="QTime">19</int><lst name="params"><str name="indent">true</str><str
name="wt">xml</str><str name="debugQuery">true</str><str
name="q">*:*</str><str
name="fq">location:"Intersects(Circle(149.39999999999998 -34.92
d=0.44964028776978415)) distErrPct=0.0"</str></lst></lst><result
name="response" numFound="2" start="0"><doc><str
name="dataAccess">PUBLIC</str><int name="licenceId">1021</int><str
name="anzlicTopic">Boundaries</str><str
name="pointOfContactPosition">sdfasgsdf</str><str
name="activityAbstract">Acquired Data</str><str
name="id">117379v004</str><str name="metadataAccess">PUBLIC</str><str
name="name">Sat Try</str><str name="sortableName">Sat Try</str><str
name="description">a</str><date
name="publishDate">2012-05-28T21:45:15Z</date><date
name="depositDate">2011-09-07T06:22:58Z</date><str
name="licenceName">Creative Commons Attribution-ShareAlike
Licence</str><str name="activityTypeName">Case Study</str><str
name="projectDescription">Acquired</str><str
name="projectName">Acquired</str><str
name="fedoraId">csiro:117379</str><int name="dcid">2514143</int><str
name="activityTitle">Acquired Data</str><bool
name="isSoftwareCollection">false</bool><str name="keywords">a</str><arr
name="location"><str>141.465484619141 -36 149.8 -30</str></arr><str
name="leadResearcherName">James Dempsey</str><str
name="pointOfContactName">James Dempsey</str><str
name="rightsStatement">All rights (including copyright) CSIRO Australia
2011.</str><arr name="topic"><str>Electrical and Electromagnetic Methods in
Geophysics</str><str>Agricultural Land Management</str></arr><arr
name="topicName"><str>Electrical and Electromagnetic Methods in
Geophysics</str><str>Agricultural Land Management</str></arr><arr
name="topicDivisionName"><str>Earth Sciences</str><str>Agricultural and
Veterinary Sciences</str></arr><arr name="topicDivision"><str>Earth
Sciences</str><str>Agricultural and Veterinary Sciences</str></arr><arr
name="personName"><str>Dempsey, James</str></arr><arr
name="person"><str>Dempsey, James</str></arr><long
name="_version_">1420941080504827904</long><date
name="timestamp">2012-12-10T04:50:13.605Z</date></doc><doc><str
name="dataAccess">PUBLIC</str><int name="licenceId">1101</int><str
name="anzlicTopic">Location</str><str name="pointOfContactPosition">Poc
position</str><str name="activityAbstract">Acquired Data</str><str
name="id">101052v002</str><str name="metadataAccess">PUBLIC</str><str
name="name">DEL134 20110908 ZZ</str><str name="sortableName">DEL134
20110908 ZZ</str><str name="description">desc</str><date
name="publishDate">2012-10-25T04:42:10Z</date><date
name="depositDate">2011-09-01T07:17:41Z</date><str name="licenceName">CSIRO
Data Licence</str><str name="activityTypeName">Case Study</str><str
name="projectDescription">Acquired</str><str
name="projectName">Acquired</str><str
name="fedoraId">csiro:101052</str><int name="dcid">2534043</int><str
name="activityTitle">Acquired Data</str><bool
name="isSoftwareCollection">false</bool><str name="keywords">anzlic
added</str><arr name="location"><str>149.1244 -35.308056</str></arr><str
name="leadResearcherName">James Dempsey</str><str
name="pointOfContactName">Poc name</str><str name="rightsStatement">All
Rights Reserved (including copyright) CSIRO Australia 2011.</str><arr
name="topic"><str>Environmental Monitoring</str></arr><arr
name="topicName"><str>Environmental Monitoring</str></arr><arr
name="topicDivisionName"><str>Environmental Sciences</str></arr><arr
name="topicDivision"><str>Environmental Sciences</str></arr><arr
name="personName"><str>Dempsey, James</str></arr><arr
name="person"><str>Dempsey, James</str></arr><long
name="_version_">1420941081560743936</long><date
name="timestamp">2012-12-10T04:50:14.612Z</date></doc></result><lst
name="debug"><str name="rawquerystring">*:*</str><str
name="querystring">*:*</str><str
name="parsedquery">MatchAllDocsQuery(*:*)</str><str
name="parsedquery_toString">*:*</str><lst name="explain"><str
name="117379v004">
1.0 = (MATCH) MatchAllDocsQuery, product of:
  1.0 = queryNorm
</str><str name="101052v002">
1.0 = (MATCH) MatchAllDocsQuery, product of:
  1.0 = queryNorm
</str></lst><str name="QParser">LuceneQParser</str><arr
name="filter_queries"><str>location:"Intersects(Circle(149.39999999999998
-34.92 d=0.44964028776978415)) distErrPct=0.0"</str></arr><arr
name="parsed_filter_queries"><str>ConstantScore(RecursivePrefixTreeFilter{fieldName='location',
shape=Circle(Pt(x=149.39999999999998,y=-34.92), d=0.4°
50.00km)})</str></arr><lst name="timing"><double
name="time">19.0</double><lst name="prepare"><double
name="time">0.0</double><lst
name="org.apache.solr.handler.component.QueryComponent"><double
name="time">0.0</double></lst><lst
name="org.apache.solr.handler.component.FacetComponent"><double
name="time">0.0</double></lst><lst
name="org.apache.solr.handler.component.MoreLikeThisComponent"><double
name="time">0.0</double></lst><lst
name="org.apache.solr.handler.component.HighlightComponent"><double
name="time">0.0</double></lst><lst
name="org.apache.solr.handler.component.StatsComponent"><double
name="time">0.0</double></lst><lst
name="org.apache.solr.handler.component.DebugComponent"><double
name="time">0.0</double></lst></lst><lst name="process"><double
name="time">19.0</double><lst
name="org.apache.solr.handler.component.QueryComponent"><double
name="time">19.0</double></lst><lst
name="org.apache.solr.handler.component.FacetComponent"><double
name="time">0.0</double></lst><lst
name="org.apache.solr.handler.component.MoreLikeThisComponent"><double
name="time">0.0</double></lst><lst
name="org.apache.solr.handler.component.HighlightComponent"><double
name="time">0.0</double></lst><lst
name="org.apache.solr.handler.component.StatsComponent"><double
name="time">0.0</double></lst><lst
name="org.apache.solr.handler.component.DebugComponent"><double
name="time">0.0</double></lst></lst></lst></lst></response>


Can anyone shed some light please on what I am doing wrong?


I can probably live with an accuracy of 500 m for a 50 km radius circle, but
an error of more than 10 km suggests to me that something is not right.

Thanks,
Javier

Re: Intersect Circle is matching points way outside the radius ( Solr 4 Spatial)

Posted by Javi Molina <ja...@gmail.com>.
Hi David,

Your latest response was lost in my inbox; I just realised it was there.

You are right. I am using OpenLayers, and even though I use the Mercator
projection, there are elements that do not adhere to that projection, in
particular the polygon that generates the circle and the scale control.

Precisely as you mention, a circle drawn using the Mercator projection will
look larger than it really is when drawn close to the poles (just like
Greenland).

I have mentioned multiple times to my team that the complexity was not in
learning OpenLayers but in grasping the concept of projections properly.


Javier

On 11 December 2012 16:28, David Smiley (@MITRE.org) <DS...@mitre.org> wrote:

> Javier,
>
> I want to expand upon what I said; you might already get this point but
> others may come along and read this and might not.
>
> Naturally you are using a 2D map as most applications do (Google Earth is
> the stand-out exception), and fundamentally this means the map is projected
> -- it has to be.  There isn't a "right" (correct) projection, generally
> speaking.  Most/all web based map APIs are strictly "web mercator".  If you
> have a map GUI selection tool in which a circle is drawn, a perfect looking
> round circle, then it's a lie unless you're looking directly at the equator.
> If the intent is for the user to draw a distance based circle, then ideally
> your map tool should draw an elliptical looking circle if it's to be
> accurate.  This is why you got confused; you saw a circle yet the point
> wasn't drawn in the circle because that circle *should have been* stretched
> vertically to barely pass it.  If on the other hand you intend for the query
> shape to be exactly what it displays to be (what appears to be a perfect
> circle), even though this means the true geodetic shape is not a perfect
> circle, then you could use geo="false" (and configure some other attributes)
> such that you are using standard planar math, not geodetic.  Then your query
> shape would appear to work correctly but IMO it's misleading compared with
> the first option (draw an ellipse, not a circle).  The circle misleads the
> user; it misled you.
>
> ~ David

Re: Intersect Circle is matching points way outside the radius ( Solr 4 Spatial)

Posted by "David Smiley (@MITRE.org)" <DS...@mitre.org>.
Javier,

I want to expand upon what I said; you might already get this point but
others may come along and read this and might not.

Naturally you are using a 2D map as most applications do (Google Earth is
the stand-out exception), and fundamentally this means the map is projected
-- it has to be.  There isn't a "right" (correct) projection, generally
speaking.  Most/all web based map APIs are strictly "web mercator".  If you
have a map GUI selection tool in which a circle is drawn, a perfect looking
round circle, then it's a lie unless you're looking directly at the equator. 
If the intent is for the user to draw a distance based circle, then ideally
your map tool should draw an elliptical looking circle if it's to be
accurate.  This is why you got confused; you saw a circle yet the point
wasn't drawn in the circle because that circle *should have been* stretched
vertically to barely pass it.  If on the other hand you intend for the query
shape to be exactly what it displays to be (what appears to be a perfect
circle), even though this means the true geodetic shape is not a perfect
circle, then you could use geo="false" (and configure some other attributes)
such that you are using standard planar math, not geodetic.  Then your query
shape would appear to work correctly but IMO it's misleading compared with
the first option (draw an ellipse, not a circle).  The circle misleads the
user; it misled you.
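
To put a rough number on the distortion, here is a minimal sketch in Python.
It assumes the spherical web Mercator model, in which the local scale factor
at a given latitude is 1/cos(latitude); the function names are illustrative
only, and the approximation holds only for circles that are small relative to
the globe.

```python
import math


def mercator_scale(lat_deg):
    """Local scale factor of the spherical Mercator projection at a latitude."""
    return 1.0 / math.cos(math.radians(lat_deg))


def projected_radius_km(ground_radius_km, lat_deg):
    """Approximate on-map radius, in projected km, of a small ground circle."""
    return ground_radius_km * mercator_scale(lat_deg)


if __name__ == "__main__":
    lat = -34.92  # latitude of the query circle's center
    print(f"scale factor at {lat}: {mercator_scale(lat):.3f}")
    print(f"50 km on the ground spans about "
          f"{projected_radius_km(50, lat):.1f} projected km")
```

At the latitude of the query center the factor is roughly 1.22, so a circle
drawn with a 50 km radius in projected map units covers noticeably less than
50 km on the ground.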

~ David


Javier Molina wrote
> Hi David,
> 
> As it happens, the points are using the right projection; I can see them in
> the same position using the page you just provided.
> 
> There is something wrong with the radius of the circle, though, which I need
> to investigate. It is a relief to know that there is nothing wrong with Solr
> and that I didn't mix up the concepts. As in many cases, the problem is
> somewhere you would never imagine.
> 
> Thanks for the hint.
> 
> Cheers,
> Javier

-----
 Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
--
View this message in context: http://lucene.472066.n3.nabble.com/Intersect-Circle-is-matching-points-way-outside-the-radius-Solr-4-Spatial-tp4025609p4025924.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Intersect Circle is matching points way outside the radius ( Solr 4 Spatial)

Posted by Javier Molina <ja...@acsmail.net.au>.
Hi David,

As it happens, the points are using the right projection; I can see them in
the same position using the page you just provided.

There is something wrong with the radius of the circle, though, which I need
to investigate. It is a relief to know that there is nothing wrong with Solr
and that I didn't mix up the concepts. As in many cases, the problem is
somewhere you would never imagine.

Thanks for the hint.

Cheers,
Javier





On 11 December 2012 02:47, David Smiley (@MITRE.org) <DS...@mitre.org> wrote:

> Javi,
>   The distance between the center point of your query circle and the indexed
> point is just under 49.9km (just under your 50km query radius); this is why
> it matched.  I plugged your numbers in here:
> http://www.movable-type.co.uk/scripts/latlong.html
> Perhaps you are misled by the projection you are using to view the map, on
> how far away the points are.
>
> FYI The default distErrPct of 0.025 should be fine in general and wasn't the
> issue.  You should (almost) never use 0.0 on the field type because that
> means your indexed non-point shapes (rectangles you said) will use a ton of
> indexed terms unless they are very small rectangles (relative to your grid
> resolution -- 1 meter in your case).  Using distErrPct=0 in the query is
> safe, on the other hand.
>
> Cheers,
>   David

Re: Intersect Circle is matching points way outside the radius ( Solr 4 Spatial)

Posted by "David Smiley (@MITRE.org)" <DS...@mitre.org>.
Javi,
  The distance between the center point of your query circle and the indexed
point is just under 49.9km (just under your 50km query radius); this is why
it matched.  I plugged your numbers in here:
http://www.movable-type.co.uk/scripts/latlong.html
Perhaps you are misled by the projection you are using to view the map, on
how far away the points are.
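
The same check can be reproduced offline. Below is a minimal sketch in Python
using the haversine great-circle formula; the mean Earth radius of 6371 km
and the function name are assumptions made for illustration, and Solr's own
distance calculation may differ slightly in its constants.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius
# Arc length of one degree on a sphere of that radius.
KM_PER_DEGREE = 2 * math.pi * EARTH_RADIUS_KM / 360.0


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two lon/lat points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


if __name__ == "__main__":
    center = (149.39999999999998, -34.92)   # query circle center (lon, lat)
    point = (149.1244, -35.308056)          # indexed point (lon, lat)
    radius_km = 0.44964028776978415 * KM_PER_DEGREE
    distance_km = haversine_km(*center, *point)
    print(f"distance {distance_km:.2f} km, radius {radius_km:.2f} km")
    print("inside circle:", distance_km <= radius_km)
```

With these inputs the distance comes out at roughly 49.9 km, slightly less
than the query radius, which is consistent with the point being matched.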

FYI The default distErrPct of 0.025 should be fine in general and wasn't the
issue.  You should (almost) never use 0.0 on the field type because that
means your indexed non-point shapes (rectangles you said) will use a ton of
indexed terms unless they are very small rectangles (relative to your grid
resolution -- 1 meter in your case).  Using distErrPct=0 in the query is
safe, on the other hand.

Cheers,
  David



--
View this message in context: http://lucene.472066.n3.nabble.com/Intersect-Circle-is-matching-points-way-outside-the-radius-Solr-4-Spatial-tp4025609p4025704.html