Posted to solr-user@lucene.apache.org by uyilmaz <uy...@vivaldi.net.INVALID> on 2020/10/19 16:50:34 UTC

Faceting on indexed=false stored=false docValues=true fields

Hey all,

From my small experiments, it looks like (unless I made a mistake) we can facet on fields that have both indexed and stored set to false:

<field name="nonStorednonIndexedDocvalues" type="string" indexed="false" stored="false" docValues="true"/>

I'm surprised by this; I thought the field would need to be indexed. Can you confirm?

Regards

-- 
uyilmaz <uy...@vivaldi.net>

Re: Faceting on indexed=false stored=false docValues=true fields

Posted by Walter Underwood <wu...@wunderwood.org>.
Hmm. Fields used for faceting will also be used for filtering, which is a kind
of search. Are docValues OK for filtering? I expect they might be slow the
first time, then cached.

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Oct 19, 2020, at 11:15 AM, Erick Erickson <er...@gmail.com> wrote:
> 
> uyilmaz:
> 
> Hmm, that _is_ confusing. And inaccurate.
> 
> In this context, it should read something like
> 
> The Text field should have indexed="true" docValues="false" if used for searching
> but not faceting, and the String field should have indexed="false" docValues="true"
> if used for faceting but not searching.
> 
> I’ll fix this, thanks for pointing this out.
> 
> Erick


Re: Faceting on indexed=false stored=false docValues=true fields

Posted by Erick Erickson <er...@gmail.com>.
uyilmaz:

Hmm, that _is_ confusing. And inaccurate.

In this context, it should read something like

The Text field should have indexed="true" docValues="false" if used for searching
but not faceting, and the String field should have indexed="false" docValues="true"
if used for faceting but not searching.
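
A schema sketch of the wording above might look like the following. The field names "title" and "title_str" are hypothetical, and the "text_general" and "string" field types are assumed to be defined elsewhere in the schema; this illustrates the suggested settings and is not text from the Ref Guide.

```xml
<!-- Sketch only: "title" and "title_str" are hypothetical field names;
     "text_general" and "string" are assumed to be existing field types. -->

<!-- Analyzed copy used for searching: indexed, no docValues. -->
<field name="title" type="text_general" indexed="true" stored="true" docValues="false"/>

<!-- Literal copy used only for faceting: docValues, not indexed. -->
<field name="title_str" type="string" indexed="false" stored="false" docValues="true"/>

<!-- Fill the string copy from the text field at index time. -->
<copyField source="title" dest="title_str"/>
```
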

I’ll fix this, thanks for pointing this out.

Erick

> On Oct 19, 2020, at 1:42 PM, uyilmaz <uy...@vivaldi.net.INVALID> wrote:
> 
> Thanks! This also contributed to my confusion:
> 
> https://lucene.apache.org/solr/guide/8_4/faceting.html#field-value-faceting-parameters
> 
> "If you want Solr to perform both analysis (for searching) and faceting on the full literal strings, use the copyField directive in your Schema to create two versions of the field: one Text and one String. Make sure both are indexed="true"."


Re: Faceting on indexed=false stored=false docValues=true fields

Posted by uyilmaz <uy...@vivaldi.net.INVALID>.
Sorry, correction, taking "the" time

On Mon, 19 Oct 2020 22:18:30 +0300
uyilmaz <uy...@vivaldi.net.INVALID> wrote:

> Thanks for taking time to write a detailed answer.
> 
> We use Solr both to store our data and to perform aggregations, using faceting or streaming expressions. When the required analysis is too complex to do in Solr, we export large query results from Solr to a more capable analysis tool.
> 
> So I guess all fields need docValues="true", because the export handler and streaming expressions both require fields to have docValues, and even if I won't use a field in queries or facets, it should be available to read in the result set. Fields that won't be searched or faceted can be (indexed=false stored=false docValues=true), right?
> 
> --uyilmaz

-- 
uyilmaz <uy...@vivaldi.net>

Re: Faceting on indexed=false stored=false docValues=true fields

Posted by uyilmaz <uy...@vivaldi.net.INVALID>.
Thanks for taking time to write a detailed answer.

We use Solr both to store our data and to perform aggregations, using faceting or streaming expressions. When the required analysis is too complex to do in Solr, we export large query results from Solr to a more capable analysis tool.

So I guess all fields need docValues="true", because the export handler and streaming expressions both require fields to have docValues, and even if I won't use a field in queries or facets, it should be available to read in the result set. Fields that won't be searched or faceted can be (indexed=false stored=false docValues=true), right?
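
For illustration, a field that is only ever read back (exported or streamed) and never searched or faceted could be declared roughly like this. The field name "raw_code" is hypothetical, and the thread itself does not confirm that this is the recommended setup; useDocValuesAsStored is shown explicitly only to make the intent visible.

```xml
<!-- Sketch only: "raw_code" is a hypothetical field name. -->
<!-- Not searchable and not stored; values live only in docValues. -->
<!-- useDocValuesAsStored lets the value be returned as if it were stored. -->
<field name="raw_code" type="string" indexed="false" stored="false"
       docValues="true" useDocValuesAsStored="true"/>
```
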

--uyilmaz


On Mon, 19 Oct 2020 14:14:27 -0400
Michael Gibney <mi...@michaelgibney.net> wrote:

> As you've observed, it is indeed possible to facet on fields with
> docValues=true, indexed=false; but in almost all cases you should
> probably set indexed=true. 1. for distributed facet count refinement,
> the "indexed" approach is used to look up counts by value; 2. assuming
> you're wanting to do something usual, e.g. allow users to apply
> filters based on facet counts, the filter application would use the
> "indexed" approach as well. Where indexed=false, if either filtering
> or distributed refinement is attempted, I'm not 100% sure what
> happens. It might fail, or lead to inconsistent results, or attempt to
> look up results via the equivalent of a "table scan" over docValues (I
> think the last of these is what actually happens, fwiw) ... but none
> of these options is likely desirable.
> 
> Michael


-- 
uyilmaz <uy...@vivaldi.net>

Re: Faceting on indexed=false stored=false docValues=true fields

Posted by Michael Gibney <mi...@michaelgibney.net>.
As you've observed, it is indeed possible to facet on fields with
docValues=true, indexed=false; but in almost all cases you should
probably set indexed=true. 1. for distributed facet count refinement,
the "indexed" approach is used to look up counts by value; 2. assuming
you're wanting to do something usual, e.g. allow users to apply
filters based on facet counts, the filter application would use the
"indexed" approach as well. Where indexed=false, if either filtering
or distributed refinement is attempted, I'm not 100% sure what
happens. It might fail, or lead to inconsistent results, or attempt to
look up results via the equivalent of a "table scan" over docValues (I
think the last of these is what actually happens, fwiw) ... but none
of these options is likely desirable.
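
As a rough sketch of that recommendation, a facet field that users will also filter on would keep both flags enabled. The field name "category" is hypothetical.

```xml
<!-- Sketch only: "category" is a hypothetical field name. -->
<!-- docValues="true" serves facet counting; indexed="true" serves
     filtering (e.g. fq=category:books) and distributed refinement lookups. -->
<field name="category" type="string" indexed="true" stored="false" docValues="true"/>
```

With indexed="false" on the same field, a filter such as fq=category:books would, per the guess above, fall back to something like a scan over docValues.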

Michael

On Mon, Oct 19, 2020 at 1:42 PM uyilmaz <uy...@vivaldi.net.invalid> wrote:
>
> Thanks! This also contributed to my confusion:
>
> https://lucene.apache.org/solr/guide/8_4/faceting.html#field-value-faceting-parameters
>
> "If you want Solr to perform both analysis (for searching) and faceting on the full literal strings, use the copyField directive in your Schema to create two versions of the field: one Text and one String. Make sure both are indexed="true"."

Re: Faceting on indexed=false stored=false docValues=true fields

Posted by uyilmaz <uy...@vivaldi.net.INVALID>.
Thanks! This also contributed to my confusion:

https://lucene.apache.org/solr/guide/8_4/faceting.html#field-value-faceting-parameters

"If you want Solr to perform both analysis (for searching) and faceting on the full literal strings, use the copyField directive in your Schema to create two versions of the field: one Text and one String. Make sure both are indexed="true"."

On Mon, 19 Oct 2020 13:08:00 -0400
Alexandre Rafalovitch <ar...@gmail.com> wrote:

> I think this is all explained quite well in the Ref Guide:
> https://lucene.apache.org/solr/guide/8_6/docvalues.html
> 
> DocValues is a different way to index/store values. Faceting is a
> primary use case where docValues are better than what 'indexed=true'
> gives you.
> 
> Regards,
>    Alex.


-- 
uyilmaz <uy...@vivaldi.net>

Re: Faceting on indexed=false stored=false docValues=true fields

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
I think this is all explained quite well in the Ref Guide:
https://lucene.apache.org/solr/guide/8_6/docvalues.html

DocValues is a different way to index/store values. Faceting is a
primary use case where docValues are better than what 'indexed=true'
gives you.

Regards,
   Alex.

On Mon, 19 Oct 2020 at 12:51, uyilmaz <uy...@vivaldi.net.invalid> wrote:
>
>
> Hey all,
>
> From my small experiments, it looks like (unless I made a mistake) we can facet on fields that have both indexed and stored set to false:
>
> <field name="nonStorednonIndexedDocvalues" type="string" indexed="false" stored="false" docValues="true"/>
>
> I'm surprised by this; I thought the field would need to be indexed. Can you confirm?
>
> Regards
>
> --
> uyilmaz <uy...@vivaldi.net>