Posted to solr-user@lucene.apache.org by Pavel Belenkovich <Pa...@exlibrisgroup.com> on 2014/05/14 17:12:47 UTC

multiValued field vs separate fields

Hi,

I'm wondering about the performance difference between 2 indexing options:

1-      multivalued field

2-      separate fields

The case is as follows:
Each document has 100 "properties": prop1..prop100.
The values are strings and there is no relation between different properties.
I would like to search by exact match on several properties by known values (like ids).
For example: search for all docs having prop1="blue" and prop6="high"

I can choose to build the indexes in 1 of 2 ways:

1-      the trivial way - 100 separate fields, 1 for each property, multiValued=false. The values are just the property values.

2-      1 field (named "properties") with multiValued=true. The field will have 100 values, each prefixed with its property name: value1="prop1:blue" .. value6="prop6:high" etc.
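
To make the two layouts concrete, here is a minimal sketch of what a document and the example query might look like under each option. The helper names are hypothetical, the "name:value" encoding for option 2 is taken from the description above, and the query strings assume plain string (untokenized) fields and values that need no escaping.

```python
# Hypothetical helpers for illustration only; they build plain dicts and
# query strings and do not talk to Solr.

def doc_separate_fields(props):
    """Option 1: one single-valued field per property."""
    return dict(props)

def doc_multivalued(props):
    """Option 2: a single multiValued field holding "name:value" entries."""
    return {"properties": [f"{name}:{value}" for name, value in props.items()]}

def query_separate_fields(criteria):
    """Exact-match query for option 1, one clause per field."""
    return " AND ".join(f'{name}:"{value}"' for name, value in criteria.items())

def query_multivalued(criteria):
    """Exact-match query for option 2, every clause hits the same field."""
    return " AND ".join(
        f'properties:"{name}:{value}"' for name, value in criteria.items()
    )

criteria = {"prop1": "blue", "prop6": "high"}
print(query_separate_fields(criteria))  # prop1:"blue" AND prop6:"high"
print(query_multivalued(criteria))      # properties:"prop1:blue" AND properties:"prop6:high"
```

Either way the query is a conjunction of exact-match clauses; the difference is whether each clause targets its own field or the shared "properties" field.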

Is it correct to say that option 1 will have much better search performance?
How about indexing (saving) performance?

thanx,
Pavel



Re: multiValued field vs separate fields

Posted by Erick Erickson <er...@gmail.com>.
I think I'd go with 100 separate fields. It's a more "natural" mapping
and probably expresses the underlying structure better.

Besides, I'd expect the index to be smaller, since you wouldn't be storing
the property name over and over and over...
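
A rough way to see the "over and over" cost is to count the raw characters each document would carry under the two layouts. This is only a back-of-the-envelope sketch: the function name and the toy values are made up, and real index size depends on how Lucene encodes and compresses terms, so treat the numbers as an illustration of the repeated prefix rather than a size prediction.

```python
def raw_value_chars(props, prefixed):
    """Total characters across all values of one document.

    prefixed=True models the single multiValued field, where every value
    carries its "name:" prefix; prefixed=False models separate fields.
    """
    if prefixed:
        return sum(len(f"{name}:{value}") for name, value in props.items())
    return sum(len(value) for value in props.values())

# Toy document: prop1..prop100, each with the value "blue" (an assumption).
props = {f"prop{i}": "blue" for i in range(1, 101)}

separate = raw_value_chars(props, prefixed=False)
combined = raw_value_chars(props, prefixed=True)
print(separate, combined, combined - separate)  # 400 1092 692
```

In this toy case the prefixes add more characters per document than the values themselves.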

Best,
Erick
