Posted to solr-user@lucene.apache.org by Vincenzo D'Amore <v....@gmail.com> on 2020/07/02 17:52:34 UTC

Solr Float/Double multivalues fields

Hi all,

Simple question: do Solr float/double multivalued fields preserve the order
of inserted values?

Best regards,
Vincenzo

-- 
Vincenzo D'Amore

Re: Solr Float/Double multivalues fields

Posted by Vincenzo D'Amore <v....@gmail.com>.
Thanks for sharing the post; I finally had time to read it :)
It is really illuminating.

On Fri, Jul 3, 2020 at 1:28 PM Toke Eskildsen <to...@kb.dk> wrote:

> On Fri, 2020-07-03 at 10:00 +0200, Vincenzo D'Amore wrote:
> > Hi Erick, I'm not sure I got it.
> > Does this mean the following for the order of values within a
> > multivalued field?
> > - With docValues=true, the result will be both re-ordered and deduplicated.
> > - With docValues=false, the values are guaranteed to be returned in
> >   insertion order.
> >
> > Is this correct?
>
> Sorta, but it is not the complete picture. Things get complicated when
> you mix it with stored, so that you have "stored=true docValues=true".
> There's an article about that at
>
>
> https://sease.io/2020/03/docvalues-vs-stored-fields-apache-solr-features-and-performance-smackdown.html
>
> BTW: The documentation should definitely mention that stored preserves
> order & duplicates. It is not obvious.
>
> - Toke Eskildsen, Royal Danish Library
>
>
>

-- 
Vincenzo D'Amore

Re: Solr Float/Double multivalues fields

Posted by Toke Eskildsen <to...@kb.dk>.
On Fri, 2020-07-03 at 10:00 +0200, Vincenzo D'Amore wrote:
> Hi Erick, I'm not sure I got it.
> Does this mean the following for the order of values within a
> multivalued field?
> - With docValues=true, the result will be both re-ordered and deduplicated.
> - With docValues=false, the values are guaranteed to be returned in
>   insertion order.
> 
> Is this correct?

Sorta, but it is not the complete picture. Things get complicated when
you mix it with stored, so that you have "stored=true docValues=true".
There's an article about that at

https://sease.io/2020/03/docvalues-vs-stored-fields-apache-solr-features-and-performance-smackdown.html

BTW: The documentation should definitely mention that stored preserves
order & duplicates. It is not obvious.

- Toke Eskildsen, Royal Danish Library



Re: Solr Float/Double multivalues fields

Posted by Vincenzo D'Amore <v....@gmail.com>.
Hi Erick, I'm not sure I got it.
Does this mean the following for the order of values within a
multivalued field?
- With docValues=true, the result will be both re-ordered and deduplicated.
- With docValues=false, the values are guaranteed to be returned in
  insertion order.

Is this correct?

On Thu, Jul 2, 2020 at 8:37 PM Erick Erickson <er...@gmail.com>
wrote:

> This is true _unless_ you fetch from docValues. docValues are SORTED_SETs,
> so the results will be both ordered and deduplicated if you return them
> as part of the field list.
>
> Don’t really think it needs to go into the ref guide, it’s just inherent
> in storing any kind of value. You wouldn’t expect multiple text entries
> in a multiValued field to be rearranged when returning the stored values
> either.
>
> Best,
> Erick
>
> > On Jul 2, 2020, at 2:21 PM, Vincenzo D'Amore <v....@gmail.com> wrote:
> >
> > Thanks, and genuinely asking: is this written somewhere in the
> > documentation too? If not, could anyone suggest which doc page I should
> > try to update?
> >
> > On Thu, Jul 2, 2020 at 8:08 PM Colvin Cowie <co...@gmail.com>
> > wrote:
> >
> >> The order of values within a multivalued field should match the
> >> insertion order; we certainly rely on that in our product.
> >>
> >>> Order is guaranteed to be maintained for values in a multi-valued field.
> >>
> >> https://lucene.472066.n3.nabble.com/order-question-on-solr-multi-value-field-tp4027695p4028057.html
> >>
> >> On Thu, 2 Jul 2020 at 18:52, Vincenzo D'Amore <v....@gmail.com> wrote:
> >>
> >>> Hi all,
> >>>
> >>> Simple question: do Solr float/double multivalued fields preserve the
> >>> order of inserted values?
> >>>
> >>> Best regards,
> >>> Vincenzo
> >>>
> >>> --
> >>> Vincenzo D'Amore
> >>>
> >>
> >
> >
> > --
> > Vincenzo D'Amore
>
>

-- 
Vincenzo D'Amore

Re: Solr Float/Double multivalues fields

Posted by Thomas Corthals <th...@klascement.net>.
On Fri, Jul 3, 2020 at 14:11, Bram Van Dam <br...@intix.eu> wrote:

> On 03/07/2020 09:50, Thomas Corthals wrote:
> > I think this should go in the ref guide. If your product depends on this
> > behaviour, you want reassurance that it isn't going to change in the next
> > release. Not everyone will go looking through the javadoc to see if this
> > is implied.
>
> This is in the ref guide. Section DocValues. Here's the quote:
>
> DocValues are only available for specific field types. The types chosen
> determine the underlying Lucene docValue type that will be used. The
> available Solr field types are:
>
> • StrField, and UUIDField:
>   ◦ If the field is single-valued (i.e., multi-valued is false), Lucene
>     will use the SORTED type.
>   ◦ If the field is multi-valued, Lucene will use the SORTED_SET type.
>     Entries are kept in sorted order and duplicates are removed.
> • BoolField:
>   ◦ If the field is single-valued (i.e., multi-valued is false), Lucene
>     will use the SORTED type.
>   ◦ If the field is multi-valued, Lucene will use the SORTED_SET type.
>     Entries are kept in sorted order and duplicates are removed.
> • Any *PointField Numeric or Date fields, EnumFieldType, and
>   CurrencyFieldType:
>   ◦ If the field is single-valued (i.e., multi-valued is false), Lucene
>     will use the NUMERIC type.
>   ◦ If the field is multi-valued, Lucene will use the SORTED_NUMERIC type.
>     Entries are kept in sorted order and duplicates are kept.
> • Any of the deprecated Trie* Numeric or Date fields, EnumField and
>   CurrencyField:
>   ◦ If the field is single-valued (i.e., multi-valued is false), Lucene
>     will use the NUMERIC type.
>   ◦ If the field is multi-valued, Lucene will use the SORTED_SET type.
>     Entries are kept in sorted order and duplicates are removed.
>
> These Lucene types are related to how the values are sorted and stored.
>
> (Source: Apache Solr Reference Guide 7.7, published 2019-03-04, around
> page 212 of 1426. © 2019, Apache Software Foundation.)
>

Great for docValues. But I couldn't find anything similar for multiValued
in the field type pages of the ref guide (unless I totally missed it
of course). It doesn't have to be as elaborate, as long as it's clear and
doesn't leave users wondering or assuming.

Re: Solr Float/Double multivalues fields

Posted by Bram Van Dam <br...@intix.eu>.
On 03/07/2020 09:50, Thomas Corthals wrote:
> I think this should go in the ref guide. If your product depends on this
> behaviour, you want reassurance that it isn't going to change in the next
> release. Not everyone will go looking through the javadoc to see if this is
> implied.

This is in the ref guide. Section DocValues. Here's the quote:

DocValues are only available for specific field types. The types chosen
determine the underlying Lucene docValue type that will be used. The
available Solr field types are:

• StrField, and UUIDField:
  ◦ If the field is single-valued (i.e., multi-valued is false), Lucene
    will use the SORTED type.
  ◦ If the field is multi-valued, Lucene will use the SORTED_SET type.
    Entries are kept in sorted order and duplicates are removed.
• BoolField:
  ◦ If the field is single-valued (i.e., multi-valued is false), Lucene
    will use the SORTED type.
  ◦ If the field is multi-valued, Lucene will use the SORTED_SET type.
    Entries are kept in sorted order and duplicates are removed.
• Any *PointField Numeric or Date fields, EnumFieldType, and
  CurrencyFieldType:
  ◦ If the field is single-valued (i.e., multi-valued is false), Lucene
    will use the NUMERIC type.
  ◦ If the field is multi-valued, Lucene will use the SORTED_NUMERIC type.
    Entries are kept in sorted order and duplicates are kept.
• Any of the deprecated Trie* Numeric or Date fields, EnumField and
  CurrencyField:
  ◦ If the field is single-valued (i.e., multi-valued is false), Lucene
    will use the NUMERIC type.
  ◦ If the field is multi-valued, Lucene will use the SORTED_SET type.
    Entries are kept in sorted order and duplicates are removed.

These Lucene types are related to how the values are sorted and stored.

(Source: Apache Solr Reference Guide 7.7, published 2019-03-04, around
page 212 of 1426. © 2019, Apache Software Foundation.)
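
For the float/double case that started this thread, the quoted rules can be
written down as a small lookup. What follows is only a hedged Python sketch
of the text above, not Solr code: the function names lucene_docvalues_type
and read_back are made up for illustration, the field type names are matched
by simple string checks, and plain Python sorting stands in for Lucene's
internal ordering.

```python
def lucene_docvalues_type(solr_field_type, multi_valued):
    """Return the Lucene docValues type named in the quoted ref guide text.

    Hypothetical helper: it only mirrors the quoted rules by class name.
    """
    string_like = {"StrField", "UUIDField", "BoolField"}
    point_like = {"EnumFieldType", "CurrencyFieldType"}
    trie_like = {"EnumField", "CurrencyField"}

    if solr_field_type in string_like:
        return "SORTED_SET" if multi_valued else "SORTED"
    if solr_field_type.endswith("PointField") or solr_field_type in point_like:
        return "SORTED_NUMERIC" if multi_valued else "NUMERIC"
    if solr_field_type.startswith("Trie") or solr_field_type in trie_like:
        return "SORTED_SET" if multi_valued else "NUMERIC"
    raise ValueError("field type not covered by the quoted text: " + solr_field_type)


def read_back(values, docvalues_type):
    """Model what a document's values look like when read from docValues."""
    if docvalues_type == "SORTED_SET":
        # Sorted order, duplicates removed.
        return sorted(set(values))
    if docvalues_type == "SORTED_NUMERIC":
        # Sorted order, duplicates kept.
        return sorted(values)
    # SORTED and NUMERIC hold a single value per document.
    return list(values)


added = [3.5, 1.0, 3.5]  # values in insertion order
for field_type in ("FloatPointField", "TrieFloatField"):
    dv_type = lucene_docvalues_type(field_type, multi_valued=True)
    print(field_type, dv_type, read_back(added, dv_type))
```

Under these assumptions a multi-valued FloatPointField comes back as
[1.0, 3.5, 3.5] (sorted, duplicate kept), while a multi-valued
TrieFloatField comes back as [1.0, 3.5] (sorted, duplicate removed). In
neither case is the insertion order [3.5, 1.0, 3.5] preserved.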




Re: Solr Float/Double multivalues fields

Posted by Thomas Corthals <th...@klascement.net>.
I think this should go in the ref guide. If your product depends on this
behaviour, you want reassurance that it isn't going to change in the next
release. Not everyone will go looking through the javadoc to see if this is
implied.

Typically it'll either be something like "are always returned in insertion
order" or "are currently returned in insertion order, but your code
shouldn't rely on this behaviour because it can change in future releases".
That's usually sufficient to make an informed decision on how to handle
returned values.

If it's different for docValues, that's even more reason to state it
clearly in the ref guide to avoid confusion.

Best,
Thomas

On Thu, Jul 2, 2020 at 20:37, Erick Erickson <er...@gmail.com> wrote:

> This is true _unless_ you fetch from docValues. docValues are SORTED_SETs,
> so the results will be both ordered and deduplicated if you return them
> as part of the field list.
>
> Don’t really think it needs to go into the ref guide, it’s just inherent
> in storing any kind of value. You wouldn’t expect multiple text entries
> in a multiValued field to be rearranged when returning the stored values
> either.
>
> Best,
> Erick
>
> > On Jul 2, 2020, at 2:21 PM, Vincenzo D'Amore <v....@gmail.com> wrote:
> >
> > Thanks, and genuinely asking: is this written somewhere in the
> > documentation too? If not, could anyone suggest which doc page I should
> > try to update?
> >
> > On Thu, Jul 2, 2020 at 8:08 PM Colvin Cowie <co...@gmail.com>
> > wrote:
> >
> >> The order of values within a multivalued field should match the
> >> insertion order; we certainly rely on that in our product.
> >>
> >>> Order is guaranteed to be maintained for values in a multi-valued field.
> >>
> >> https://lucene.472066.n3.nabble.com/order-question-on-solr-multi-value-field-tp4027695p4028057.html
> >>
> >> On Thu, 2 Jul 2020 at 18:52, Vincenzo D'Amore <v....@gmail.com> wrote:
> >>
> >>> Hi all,
> >>>
> >>> Simple question: do Solr float/double multivalued fields preserve the
> >>> order of inserted values?
> >>>
> >>> Best regards,
> >>> Vincenzo
> >>>
> >>> --
> >>> Vincenzo D'Amore
> >>>
> >>
> >
> >
> > --
> > Vincenzo D'Amore
>
>

Re: Solr Float/Double multivalues fields

Posted by Erick Erickson <er...@gmail.com>.
This is true _unless_ you fetch from docValues. docValues are SORTED_SETs,
so the results will be both ordered and deduplicated if you return them
as part of the field list.

Don’t really think it needs to go into the ref guide, it’s just inherent in storing
any kind of value. You wouldn’t expect multiple text entries in a multiValued
field to be rearranged when returning the stored values either.

Best,
Erick
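
To make the difference concrete, here is a minimal Python sketch of the two
read paths described above. It only models the behaviour with plain lists:
the names stored_view and sorted_set_view are invented for illustration and
are not Solr or Lucene APIs, and Python's sorted() on short ASCII strings
stands in for Lucene's own term ordering.

```python
def stored_view(values):
    """Stored values: returned exactly as they were added."""
    return list(values)


def sorted_set_view(values):
    """SORTED_SET docValues: distinct values, in sorted order."""
    return sorted(set(values))


added = ["pear", "apple", "pear", "fig"]  # insertion order, one duplicate

print(stored_view(added))      # ['pear', 'apple', 'pear', 'fig']
print(sorted_set_view(added))  # ['apple', 'fig', 'pear']
```

The stored view keeps both the order and the repeated "pear"; the
SORTED_SET view loses both.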

> On Jul 2, 2020, at 2:21 PM, Vincenzo D'Amore <v....@gmail.com> wrote:
> 
> Thanks, and genuinely asking: is this written somewhere in the
> documentation too? If not, could anyone suggest which doc page I should
> try to update?
> 
> On Thu, Jul 2, 2020 at 8:08 PM Colvin Cowie <co...@gmail.com>
> wrote:
> 
>> The order of values within a multivalued field should match the insertion
>> order; we certainly rely on that in our product.
>>
>>> Order is guaranteed to be maintained for values in a multi-valued field.
>>
>> https://lucene.472066.n3.nabble.com/order-question-on-solr-multi-value-field-tp4027695p4028057.html
>> 
>> On Thu, 2 Jul 2020 at 18:52, Vincenzo D'Amore <v....@gmail.com> wrote:
>> 
>>> Hi all,
>>> 
>>> Simple question: do Solr float/double multivalued fields preserve the
>>> order of inserted values?
>>> 
>>> Best regards,
>>> Vincenzo
>>> 
>>> --
>>> Vincenzo D'Amore
>>> 
>> 
> 
> 
> -- 
> Vincenzo D'Amore


Re: Solr Float/Double multivalues fields

Posted by Vincenzo D'Amore <v....@gmail.com>.
Thanks, and genuinely asking: is this written somewhere in the
documentation too? If not, could anyone suggest which doc page I should
try to update?

On Thu, Jul 2, 2020 at 8:08 PM Colvin Cowie <co...@gmail.com>
wrote:

> The order of values within a multivalued field should match the insertion
> order; we certainly rely on that in our product.
>
> > Order is guaranteed to be maintained for values in a multi-valued field.
>
> https://lucene.472066.n3.nabble.com/order-question-on-solr-multi-value-field-tp4027695p4028057.html
>
> On Thu, 2 Jul 2020 at 18:52, Vincenzo D'Amore <v....@gmail.com> wrote:
>
> > Hi all,
> >
> > Simple question: do Solr float/double multivalued fields preserve the
> > order of inserted values?
> >
> > Best regards,
> > Vincenzo
> >
> > --
> > Vincenzo D'Amore
> >
>


-- 
Vincenzo D'Amore

Re: Solr Float/Double multivalues fields

Posted by Colvin Cowie <co...@gmail.com>.
The order of values within a multivalued field should match the insertion
order; we certainly rely on that in our product.

> Order is guaranteed to be maintained for values in a multi-valued field.

https://lucene.472066.n3.nabble.com/order-question-on-solr-multi-value-field-tp4027695p4028057.html

On Thu, 2 Jul 2020 at 18:52, Vincenzo D'Amore <v....@gmail.com> wrote:

> Hi all,
>
> Simple question: do Solr float/double multivalued fields preserve the
> order of inserted values?
>
> Best regards,
> Vincenzo
>
> --
> Vincenzo D'Amore
>