Posted to users@solr.apache.org by Mónica Marrero <mo...@europeana.eu> on 2021/06/09 15:01:19 UTC

Sorting is working in primitive multivalued fields without docValues

Hi,

I am using Solr 7.7 in SolrCloud mode, and I had understood from the
documentation that sorting is not possible on multivalued fields when
docValues are not enabled. To my surprise, I am able to sort directly (e.g.
sort=CREATOR asc) on the two fields below (I have also copied the definitions
of their field types):

<fieldType name="int" class="solr.TrieIntField" precisionStep="0" positionIncrementGap="0"/>
<fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
<field name="CREATOR" type="string" indexed="true" stored="true" multiValued="true"/>
<field name="filter_tags" type="int" indexed="true" stored="true" multiValued="true"/>

Am I missing something? The schema version is still 0.8 in case that makes
any difference.

Thanks in advance for your help.

-- 
Disclaimer: This email and any files transmitted with it are confidential 
and intended solely for the use of the individual or entity to whom they 
are
addressed. If you have received this email in error please notify the 
system manager. If you are not the named addressee you should not 
disseminate,
distribute or copy this email. Please notify the sender 
immediately by email if you have received this email by mistake and delete 
this email from your
system.

Re: Sorting is working in primitive multivalued fields without docValues

Posted by Mónica Marrero <mo...@europeana.eu>.
Thanks, Alessandro, I will wait for your investigation and let you know if
I find anything else in the meantime. About the trie fields, yes, we are
about to replace them with the point fields, hopefully in the next
reindexing we run.

Cheers,

Mónica



Re: Sorting is working in primitive multivalued fields without docValues

Posted by Alessandro Benedetti <a....@sease.io>.
Hi Monica,
I think the trie fields are deprecated in favor of the point fields.

Regarding multivalued date sorting without docValues, the relevant commit is
the following:
https://github.com/apache/lucene-solr/commit/e2bba98/
I should investigate more, but from a quick look, the tests expect
exceptions for dates, so *it seems* you are getting the expected behavior.
I haven't had time to fully investigate and understand it, so I may be
wrong.

Cheers
--------------------------
Alessandro Benedetti
Apache Lucene/Solr Committer
Director, R&D Software Engineer, Search Consultant

www.sease.io
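
A minimal sketch of point-based replacements for the Trie types discussed in
this thread. The type names pint and pdate are assumptions (they follow the
naming used in Solr's default configset and are not part of the original
schema). Point fields cannot be uninverted, so docValues="true" is set on the
types to keep the fields sortable. This is an illustration under those
assumptions, not a tested migration, and a full reindex would be needed after
the change.

```xml
<!-- Sketch only: hypothetical point-based types replacing TrieIntField and TrieDateField. -->
<fieldType name="pint" class="solr.IntPointField" docValues="true"/>
<fieldType name="pdate" class="solr.DatePointField" docValues="true"/>

<!-- Field names are taken from the thread; the type names are the assumed ones above. -->
<field name="filter_tags" type="pint" indexed="true" stored="true" multiValued="true"/>
<field name="wr_cc_deprecated_on" type="pdate" indexed="true" stored="true" multiValued="true"/>
```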


Re: Sorting is working in primitive multivalued fields without docValues

Posted by Mónica Marrero <mo...@europeana.eu>.
Thank you! Yes, that is the quote, and here is the link to the
documentation where it appears:
https://solr.apache.org/guide/7_7/common-query-parameters.html#sort-parameter

I have also tested with date types (also multivalued and with no docValues)
and in that case I get the following error:
"can not sort on a field w/o docValues unless it is indexed=true
uninvertible=true and the type supports Uninversion: ww_cc_deprecated_on"

The field:
<fieldType name="date" class="solr.TrieDateField" omitNorms="true" precisionStep="0" positionIncrementGap="0"/>
<field name="wr_cc_deprecated_on" type="date" indexed="true" stored="true" multiValued="true"/>

I understand, then, that sorting is always allowed on fields that support
uninversion, and that not all primitive types do.

Best,

Mónica

Re: Sorting is working in primitive multivalued fields without docValues

Posted by Alessandro Benedetti <a....@sease.io>.
From the wiki:

> The value of any primitive field (numerics, string, boolean, dates, etc.)
> which has docValues="true" (or multiValued="false" and indexed="true", in
> which case the indexed terms will used to build DocValue like structures on
> the fly at runtime)

I think the documentation is incorrect.

Taking a look at the code:
org.apache.solr.schema.PrimitiveFieldType#getDefaultMultiValueSelectorForSort
https://issues.apache.org/jira/browse/SOLR-11854
https://github.com/apache/lucene-solr/commit/e2bba98/

It seems the documentation was correct at the time of the original Jira?
Maybe a regression happened.
I will spend some more time on this tomorrow.

--------------------------
Alessandro Benedetti
Apache Lucene/Solr Committer
Director, R&D Software Engineer, Search Consultant

www.sease.io


Re: Sorting is working in primitive multivalued fields without docValues

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
I am pretty sure your reality is correct and your reading of the
documentation (or the documentation itself) is less than perfect. docValues
are strongly recommended if you are going to do a lot of sorting. But the
ability to sort existed before docValues were created.

Can you send the specific (version-specific ideally) link and quote
that confuses you?

Regards,
   Alex.
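
A minimal sketch of what enabling docValues on the two fields from the
original question could look like, assuming the field and type names from
that schema are kept unchanged. The only difference from the posted
definitions is the added docValues="true" attribute. A full reindex would be
needed afterwards, since docValues are written at index time.

```xml
<!-- Sketch only: the original field definitions with docValues switched on. -->
<field name="CREATOR" type="string" indexed="true" stored="true"
       multiValued="true" docValues="true"/>
<field name="filter_tags" type="int" indexed="true" stored="true"
       multiValued="true" docValues="true"/>
```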
