Posted to solr-user@lucene.apache.org by Utkarsh Sengar <ut...@gmail.com> on 2014/04/30 21:24:41 UTC

Denormalize or use multivalued field for nested data?

I need to modify a schema so that I can attach nested "pricing per store"
information to a product. For example:

10010137332:{
   title:"iPad 64gb"
   description: "iPad 64gb with retina"
   pricing:{
        merchantid64354:{
              locationid643:{
                 "USD|600"
              }
              locationid6436:{
                 "USD|600"
              }
        }
        merchantid343:{
              locationid1345:{
                 "USD|600"
              }
              locationid4353:{
                 "USD|600"
              }
        }
   }
}


This is what is suggested all over the internet: denormalize it.
In my case, I would end up with a total number of columns equal to the
total number of locations with a price, which is about 100k. I don't think
having 100k columns for 60M products is a good idea.

Are there any better ways of handling this?
I am trying to figure out multivalued fields, but as far as I understand,
a multivalued field can only be used as a "flag" and cannot be used to
look up a value associated with a key.

Based on this answer, Solr 4.5+ supports nested documents:
http://stackoverflow.com/a/5585891/231917 but I am currently on 4.4.



-- 
Thanks,
-Utkarsh

Re: Denormalize or use multivalued field for nested data?

Posted by Anshum Gupta <an...@anshumgupta.net>.
Block joins could be what you're looking for if you can upgrade to 4.5+:
https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-BlockJoinQueryParsers

I'd recommend an upgrade, but if that's not possible, replicating the parent
information in each child document is the way to go.
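
To illustrate the block join option, here is a minimal sketch of how a
"parent" block join query string could be assembled. The field names
(doc_type, merchantid, locationid) and the helper function are hypothetical
and assume products are indexed as parents with one child document per
merchant/location price. Only the {!parent which=...} syntax comes from the
block join query parser itself.

```python
def block_join_parent_query(parent_filter: str, child_query: str) -> str:
    """Build a query that returns parent docs whose children match child_query.

    parent_filter must match ALL parent documents (and no children);
    "doc_type:product" is an assumed marker field, not something Solr provides.
    """
    return '{!parent which="%s"}%s' % (parent_filter, child_query)


# Hypothetical usage: find products sold by merchant 343 at location 1345.
q = block_join_parent_query("doc_type:product",
                            "+merchantid:343 +locationid:1345")
print(q)
```

The resulting string would be passed as the q parameter of a normal select
request on a version that supports block joins.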







-- 

Anshum Gupta
http://www.anshumgupta.net

Re: Denormalize or use multivalued field for nested data?

Posted by Erick Erickson <er...@gmail.com>.
I think you are misunderstanding "denormalize" in this context. It
still may not be what you want to do for other reasons, but the usual
idea is to replicate the parent info in each of the children, so you'd
have something like:


doc1 = title:"iPad 64gb" description:"iPad 64gb with retina"
       merchantid:343 locationid:1345 cost:USD|600

doc2 = title:"iPad 64gb" description:"iPad 64gb with retina"
       merchantid:343 locationid:4353 cost:USD|600

And so on.
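
As a rough sketch of that flattening step (not a definitive implementation),
the nested structure from the original question could be turned into one
document per merchant/location pair as shown here. The function name, the
composite "id" format, and the "product_id" field are assumptions made for
illustration. The other field names follow the example above.

```python
from typing import Iterator


def flatten_product(product_id: str, product: dict) -> Iterator[dict]:
    """Yield one flat document per (merchant, location) price entry,
    copying the parent's title and description into each one."""
    for merchant_id, locations in product["pricing"].items():
        for location_id, cost in locations.items():
            yield {
                # Assumed unique key: product + merchant + location.
                "id": f"{product_id}_{merchant_id}_{location_id}",
                "product_id": product_id,
                "title": product["title"],
                "description": product["description"],
                "merchantid": merchant_id,
                "locationid": location_id,
                "cost": cost,
            }


product = {
    "title": "iPad 64gb",
    "description": "iPad 64gb with retina",
    "pricing": {
        "64354": {"643": "USD|600", "6436": "USD|600"},
        "343": {"1345": "USD|600", "4353": "USD|600"},
    },
}

docs = list(flatten_product("10010137332", product))
for doc in docs:
    print(doc)
```

This keeps the schema at a fixed, small number of fields. The number of
documents grows with the number of priced locations instead of the number
of columns.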

Best,
Erick
