Posted to solr-user@lucene.apache.org by blopez <ba...@hotmail.com> on 2012/11/15 10:50:52 UTC

PointType multivalued query

Hi all,

I'm using a multivalued PointType (6 dimensions) in my Solr schema. Imagine
that I have one doc indexed in Solr:

<doc>
      <field name="docId">-1</field>
      <field name="point">1,1,1,1,1,1</field>
      <field name="point">5,5,5,5,5,5</field>
</doc>

Now imagine that I launch some queries:

point:[0,0,0,0,0,0 TO 2,2,2,2,2,2]: Works OK (matches the first doc point
and returns doc -1)
point:[4,4,4,4,4,4 TO 6,6,6,6,6,6]: Works OK (matches the second doc point
and returns doc -1)

point:[4,0,0,0,0,0 TO 6,2,2,2,2,2]: Does not work. The first dimension of
the query range matches the second doc point, and the remaining dimensions
match the first doc point, so the query returns doc -1 when it must NOT
return any doc. I only want to retrieve docs that have a single point
matching the query range in all dimensions.
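
The mismatch described above can be reproduced outside Solr with a small
sketch. It assumes (this is not confirmed anywhere in the thread) that each
dimension of a multivalued PointType is matched independently of the others,
which is consistent with the results reported here. The function names are
hypothetical and only illustrate the two matching semantics.

```python
from typing import Sequence

Point = Sequence[float]


def matches_per_dimension(points: Sequence[Point], low: Point, high: Point) -> bool:
    """Assumed multivalued behavior: every dimension must be satisfied,
    but each dimension may be satisfied by a different stored point."""
    return all(
        any(low[d] <= p[d] <= high[d] for p in points)
        for d in range(len(low))
    )


def matches_whole_point(points: Sequence[Point], low: Point, high: Point) -> bool:
    """Desired behavior: one single stored point must fall inside the
    query range in all dimensions at once."""
    return any(
        all(low[d] <= p[d] <= high[d] for d in range(len(low)))
        for p in points
    )


# The two points of doc -1 from the example above.
doc_points = [(1, 1, 1, 1, 1, 1), (5, 5, 5, 5, 5, 5)]

# The problematic query: point:[4,0,0,0,0,0 TO 6,2,2,2,2,2]
low, high = (4, 0, 0, 0, 0, 0), (6, 2, 2, 2, 2, 2)

false_positive = matches_per_dimension(doc_points, low, high)
wanted_result = matches_whole_point(doc_points, low, high)
```

Under that assumption, `false_positive` is True (the doc is returned) while
`wanted_result` is False (no single point lies inside the range).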

I don't know if my problem is the PointType data type or bad behavior of the
multivalued items. What do you think about that?

Regards,
Borja.




--
View this message in context: http://lucene.472066.n3.nabble.com/PointType-multivalued-query-tp4020445.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: PointType multivalued query

Posted by "David Smiley (@MITRE.org)" <DS...@mitre.org>.
Sorry, you're out of luck.  SRPT could be generalized but that's a bit of
work.  The trickiest part I think would be writing a multi-dimensional
SpatialPrefixTree impl.

If the # of discrete values at each dimension is pretty small (<100? ish?),
then there is a way using term positions and span queries (I think).  In
your example you're using 1, 2, ... up to 6.  But I figure that's to keep
your example simple and isn't reflective of your actual data.

~ David



-----
 Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book

Re: PointType multivalued query

Posted by blopez <ba...@hotmail.com>.
Sorry, I tried to explain it too fast.

Consider the use case I described in my first post.

A document can have more than one 6-dimensional point. So my first approach
was:
<doc>
   <field name="pk">1</field>
   <field name="docId">10</field>
   <field name="point">2,2,2,2,2,2</field>
</doc>
<doc>
   <field name="pk">2</field>
   <field name="docId">10</field>
   <field name="point">3,3,3,3,3,3</field>
</doc>
<doc>
   <field name="pk">3</field>
   <field name="docId">10</field>
   <field name="point">4,4,4,4,4,4</field>
</doc>

It works fine and I don't think it performs badly, but there is a lot of
redundant data (high disk space cost). That's why I thought about
multivalued fields:

<doc>
   <field name="docId">10</field>
   <field name="point">2,2,2,2,2,2</field>
   <field name="point">3,3,3,3,3,3</field>
   <field name="point">4,4,4,4,4,4</field>
</doc>
   
My first attempt at implementing this was PointType. But I have the problem
that I described in my first message: each search query is a 6-dimensional
range that has to be fully matched by a single indexed point, and as far as
I know I cannot do that with PointType.

SpatialRecursivePrefixTreeFieldType would be perfect if I could use it with
more than two dimensions.

Regards,
Borja.




Re: PointType multivalued query

Posted by "David Smiley (@MITRE.org)" <DS...@mitre.org>.
Borja,
Umm, I'm quite confused with the use-case you present.
~ David




Re: PointType multivalued query

Posted by blopez <ba...@hotmail.com>.
Hi,

I don't think it's a good idea to do join operations between Solr cores
because of the performance (we manage a lot of data).

The point is that we want to store documents, each one with several
information sets (let's call them Points), each one identified by 6 values
(that's why I was trying to use a 6-dimensional PointType).

I'm doing this to try to reduce the index size and indexing time (and, if
possible, the retrieval time), because at the moment we have it implemented
in another index structure where each of these point values is stored in its
own Solr field. This layout (shown below) is, I think, less efficient than
what I was trying to do with PointType:
<doc>
<pk>
<docToReference>
<pointValue1>
<pointValue2>
...
<pointValue6>
</doc>

So for "docToReference"=1 we may have thousands of "point sets", which
means a lot of noise in the Solr index.

What do you think about that?

Thank you very much,
Borja.




Re: PointType multivalued query

Posted by "David Smiley (@MITRE.org)" <DS...@mitre.org>.
Oh I'm sorry, I should have read your question more carefully.  I totally
forgot that solr.PointType supports a configurable number of dimensions.  If
you need more than 2 dimensions as your example shows you do, then you'll
have to resort to indexing your spatial data in another Solr core as
non-multiValued and then use Solr 4's new join query to relate the data back
to your main query.  You probably won't be pleased with the performance if
you have a lot of data.  Eventually, I'm hopeful for Solr supporting index
level "block join" (I think it's called) which will speed this up.
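
As a rough sketch of the join approach described above, the snippet below
only builds the query string; it does not talk to Solr. It uses Solr 4's
`{!join from=... to=... fromIndex=...}` local-params syntax. The core name
`points` and the field names are assumptions made for illustration, and the
helper function is hypothetical.

```python
def build_join_query(from_field: str, to_field: str,
                     from_index: str, inner_query: str) -> str:
    """Build a Solr 4 join query string.

    from_field  -- field in the secondary core holding the parent doc id
    to_field    -- field in the main core that the ids are matched against
    from_index  -- name of the secondary core (hypothetical here)
    inner_query -- query run against the secondary core
    """
    return (
        f"{{!join from={from_field} to={to_field} "
        f"fromIndex={from_index}}}{inner_query}"
    )


# Assumed layout: a core named "points" with one non-multiValued point
# per document, plus a docId field pointing back to the main document.
join_q = build_join_query(
    from_field="docId",
    to_field="docId",
    from_index="points",
    inner_query="point:[4,0,0,0,0,0 TO 6,2,2,2,2,2]",
)

# Request parameters that a client would send to the main core.
params = {"q": join_q}
```

Because each document in the secondary core holds a single point, the range
query can only match when one point satisfies all dimensions, at the cost of
the join performance mentioned above.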

What are you using this for, anyway?




Re: PointType multivalued query

Posted by blopez <ba...@hotmail.com>.
Hi David,

thanks for your reply.

I've tested this datatype and the values are indexed fine (I'm using
6-dimensions points).

I'm trying to retrieve results and it works only with the first 2 dimensions
(X and Y), but it's not taking the other 4 dimensions into account.

I've been reading the documentation you sent me but I cannot see an
attribute to define the number of dimensions I should use.

Do you know what's happening?

Regards,
Borja.




Re: PointType multivalued query

Posted by "David Smiley (@MITRE.org)" <DS...@mitre.org>.
Hi Borja.

Solr 3 PointType & LatLonType do not support multiValued fields.  Use Solr 4
SpatialRecursivePrefixTreeFieldType.  For you, use geo="false" and set
worldBounds & distErr to appropriate values.  See: 
http://wiki.apache.org/solr/SolrAdaptersForLuceneSpatial4

~ David Smiley


