Posted to solr-user@lucene.apache.org by Vijay Kokatnur <ko...@gmail.com> on 2014/03/13 04:16:22 UTC

Re-index Parent-Child Schema

Hi,

I've inherited a Solr application with a schema that contains a parent-child
relationship.  All child elements are maintained in multi-valued fields,
so an Order with 3 order lines results in arrays of size 3 in Solr.

This worked fine as long as clients queried only on Order, but with new
requirements it is serving inaccurate results.

Consider some orders, for example -


 {
OrderId:123
BookingRecordId : ["145", "987", "*234*"]
OrderLineType : ["11", "12", "*13*"]
.....
}
 {
OrderId:345
BookingRecordId : ["945", "882", "*234*"]
OrderLineType : ["1", "12", "*11*"]
.....
}
 {
OrderId:678
BookingRecordId : ["444"]
OrderLineType : ["11"]
.....
}


If you search for an Order with BookingRecordId:234 AND OrderLineType:11,
you will get two orders, 123 and 345, which is correct per Solr: in both
orders, each of the two arrays contains a value that satisfies its condition.

However, for OrderId:123, the value at the 3rd index of the OrderLineType
array (the one paired with BookingRecordId:234) is 13, not 11. The 11 belongs
to BookingRecordId:145, so this order should be excluded.
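
To make the mismatch concrete, here is a minimal Python sketch that models the
behaviour outside Solr. It is only an illustration of the matching logic, not
Solr code; the function names are hypothetical, and the data is the three
example orders above with the emphasis markers removed.

```python
# Toy model of the three example orders (multi-valued fields as parallel lists).
orders = [
    {"OrderId": "123", "BookingRecordId": ["145", "987", "234"],
     "OrderLineType": ["11", "12", "13"]},
    {"OrderId": "345", "BookingRecordId": ["945", "882", "234"],
     "OrderLineType": ["1", "12", "11"]},
    {"OrderId": "678", "BookingRecordId": ["444"],
     "OrderLineType": ["11"]},
]


def match_any_position(order, booking_id, line_type):
    """Field-level AND: each value may come from a different order line."""
    return (booking_id in order["BookingRecordId"]
            and line_type in order["OrderLineType"])


def match_same_position(order, booking_id, line_type):
    """What the new requirement needs: both values on the same order line."""
    pairs = zip(order["BookingRecordId"], order["OrderLineType"])
    return (booking_id, line_type) in pairs


loose = [o["OrderId"] for o in orders if match_any_position(o, "234", "11")]
strict = [o["OrderId"] for o in orders if match_same_position(o, "234", "11")]

print(loose)   # ['123', '345']
print(strict)  # ['345']
```

The first function reproduces what the current schema returns; the second is
the position-aware result the clients actually expect.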

Per this blog:
http://blog.griddynamics.com/2011/06/solr-experience-search-parent-child.html

I can't use the span queries it describes, as I have tons of child elements
to query and I want to keep changes to client queries to a minimum.

So is creating multiple indexes the only way? We have 3 physical boxes
with SolrCloud, and at some point we would like to shard.

I'd appreciate any input.


Best,

-Vijay

Re: Re-index Parent-Child Schema

Posted by Vijay Kokatnur <ko...@gmail.com>.
Hello Mikhail,

Thanks for the suggestions.  It took me some time to get to these:

1. Field collapsing cannot be done on multi-valued fields -
https://wiki.apache.org/solr/FieldCollapsing

2. Join acts on documents. How can I use it to join multi-valued fields in
the same document?

3. Block-join requires indexing parent and child documents separately
using the IndexWriter.addDocuments API.

4. Concatenation requires me to index with those columns concatenated.
This is not possible, as I have around 20 multi-valued fields.
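
For reference, the reshaping that block-join (point 3) implies could be
sketched roughly as below. This is an assumption-laden illustration, not a
tested Solr setup: the `doc_type` field, the child `id` scheme, and the
`to_block` helper are hypothetical, and the `_childDocuments_` key and the
`{!parent which=...}` query syntax should be checked against the Solr version
in use.

```python
def to_block(order):
    """Split one flat order into a parent doc plus one child doc per order line."""
    lines = zip(order["BookingRecordId"], order["OrderLineType"])
    children = [
        {
            "id": f'{order["OrderId"]}-{i}',  # hypothetical child id scheme
            "doc_type": "line",               # hypothetical discriminator field
            "BookingRecordId": booking_id,
            "OrderLineType": line_type,
        }
        for i, (booking_id, line_type) in enumerate(lines)
    ]
    return {
        "id": order["OrderId"],
        "doc_type": "order",
        "OrderId": order["OrderId"],
        "_childDocuments_": children,  # assumed nested-document key
    }


order = {
    "OrderId": "123",
    "BookingRecordId": ["145", "987", "234"],
    "OrderLineType": ["11", "12", "13"],
}
block = to_block(order)

# Parent query that requires both conditions on the same child document.
query = '{!parent which="doc_type:order"}+BookingRecordId:234 +OrderLineType:11'
```

Each child carries single-valued fields, so both conditions have to hold on
one order line. The cost is exactly the one raised above: the data has to be
re-indexed in this shape.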

Is there a way to solve this without changing how it's indexed?

Best,
-Vijay

On Thu, Mar 13, 2014 at 1:39 AM, Mikhail Khludnev <
mkhludnev@griddynamics.com> wrote:

> Hello Vijay,
> You can try FieldCollapsing, Join, Block-join, or just concatenate both
> fields and search on the concatenation.

Re: Re-index Parent-Child Schema

Posted by Mikhail Khludnev <mk...@griddynamics.com>.
Hello Vijay,
You can try FieldCollapsing, Join, Block-join, or just concatenate both
fields and search on the concatenation.
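
As a rough illustration of the concatenation idea, the sketch below builds one
extra multi-valued field whose values pair up the two original fields by
position. The field name `BookingRecordId_OrderLineType`, the `_` separator,
and both helper functions are hypothetical choices made for the example; it
covers a single pair of fields only.

```python
def add_pair_field(order, sep="_"):
    """Return a copy of the order with a position-aligned combined field."""
    doc = dict(order)
    doc["BookingRecordId_OrderLineType"] = [
        f"{booking_id}{sep}{line_type}"
        for booking_id, line_type in zip(order["BookingRecordId"],
                                         order["OrderLineType"])
    ]
    return doc


def pair_query(booking_id, line_type, sep="_"):
    """Build a query string against the combined field."""
    return f'BookingRecordId_OrderLineType:"{booking_id}{sep}{line_type}"'


order_123 = add_pair_field({
    "OrderId": "123",
    "BookingRecordId": ["145", "987", "234"],
    "OrderLineType": ["11", "12", "13"],
})
order_345 = add_pair_field({
    "OrderId": "345",
    "BookingRecordId": ["945", "882", "234"],
    "OrderLineType": ["1", "12", "11"],
})

print(order_123["BookingRecordId_OrderLineType"])  # ['145_11', '987_12', '234_13']
print(pair_query("234", "11"))  # BookingRecordId_OrderLineType:"234_11"
```

With this field in place, order 345 contains the value `234_11` and order 123
does not, so only the correctly paired order matches.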




-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>
 <mk...@griddynamics.com>