Posted to solr-user@lucene.apache.org by Wei <we...@gmail.com> on 2018/01/30 23:27:09 UTC

facet.method=uif not working in solr cloud?

Hi,

I am using the following parameters for faceting, requesting that Solr use
the UIF method:

&facet=on&facet.field=color&q=*:*&facet.method=uif&facet.mincount=1&debugQuery=true

It works as expected in my local standalone solr:


   "facet-debug": {
      "elapse": 2,
      "sub-facet": [
         {
            "processor": "SimpleFacets",
            "elapse": 2,
            "action": "field facet",
            "maxThreads": 0,
            "sub-facet": [
               {
                  "elapse": 2,
                  "requestedMethod": "UIF",
                  "appliedMethod": "UIF",
                  "inputDocSetSize": 8191,
                  "field": "color"
               }
            ]
         }
      ]
   },


However, when I apply the same query to Solr Cloud with multiple shards, the
appliedMethod is always FC instead of UIF:

{
   "processor": "SimpleFacets",
   "elapse": 18,
   "action": "field facet",
   "maxThreads": 0,
   "sub-facet": [
      {
         "elapse": 58,
         "requestedMethod": "UIF",
         "appliedMethod": "FC",
         "inputDocSetSize": 33487,
         "field": "color",
         "numBuckets": 238
      }
   ]
}

I also see that in standalone mode fieldValueCache is used when UIF is
applied, but in cloud mode fieldValueCache is always empty.  Are there any
other parameters I need to set to apply UIF faceting in Solr Cloud?

Thanks,
Wei

Re: facet.method=uif not working in solr cloud?

Posted by Yonik Seeley <ys...@gmail.com>.
On Wed, Feb 14, 2018 at 7:24 PM, Wei <we...@gmail.com> wrote:
> Thanks Yonik. If uif has big upfront cost when hits solr the first time,
> in solr cloud the same faceting request could hit different replicas in the
> same shard, so that cost will happen at least for the number of replicas?
> If we are doing frequent auto commits, fieldvaluecache will be invalidated
> and uif will have to pay the upfront cost again after each commit?

Right.  It's not good for frequently changing indexes.
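
To put rough numbers on this exchange, here is a back-of-the-envelope sketch. It is arithmetic only, not a measurement and not a Solr API. It assumes the 8 shards with 6 replicas each mentioned elsewhere in this thread, that every core keeps its own fieldValueCache, and that every replica receives at least one uif facet request between commits. The class name and the one-commit-per-minute rate are hypothetical.

```java
/** Back-of-the-envelope estimate; not based on any Solr API. */
final class UifWarmupCostSketch {

    /**
     * Upper bound on how often the uif structure for one field is rebuilt,
     * assuming each core rebuilds it once after every commit.
     */
    static long worstCaseRebuilds(int shards, int replicasPerShard, long commits) {
        return (long) shards * replicasPerShard * commits;
    }

    public static void main(String[] args) {
        int shards = 8;            // layout reported in the thread
        int replicasPerShard = 6;  // layout reported in the thread
        long commitsPerHour = 60;  // hypothetical: one auto commit per minute

        System.out.println("cores paying the cost per commit: "
                + worstCaseRebuilds(shards, replicasPerShard, 1));
        System.out.println("worst-case rebuilds per hour: "
                + worstCaseRebuilds(shards, replicasPerShard, commitsPerHour));
    }
}
```

Under these assumptions the upfront cost is paid on up to 48 cores after each commit, which is why uif suits indexes that change rarely.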

-Yonik

>
>
> On Wed, Feb 14, 2018 at 11:51 AM, Yonik Seeley <ys...@gmail.com> wrote:
>
>> On Wed, Feb 14, 2018 at 2:28 PM, Wei <we...@gmail.com> wrote:
>> > Thanks all!   It's really great learning.  A bit off the topic, after I
>> > enabled facet.method = uif in solr cloud,  the faceting performance is
>> > actually much worse than the original fc( ~1000 ms with uif  vs ~200 ms
>> > with fc). My cloud has 8 shards with 6 replicas in each shard.  I do see
>> > that fieldValueCache is getting utilized.  Any reason uif could be so
>> > slow?
>>
>> I haven't seen that before.  Are you sure it's not the first time
>> faceting on a field?  uif has big upfront cost, but is usually faster
>> once that cost has been paid.
>>
>>
>> -Yonik
>>
>> > On Tue, Feb 13, 2018 at 7:41 AM, Yonik Seeley <ys...@gmail.com> wrote:
>> >
>> >> Great, thanks for tracking that down!
>> >> It's interesting that a mincount of 0 disables uif processing in the
>> >> first place.  IIRC, it's only the hash-based method (as opposed to
>> >> array-based) that can't return zero counts.
>> >>
>> >> -Yonik
>> >>
>> >>
>> >> On Tue, Feb 13, 2018 at 6:17 AM, Alessandro Benedetti
>> >> <a....@sease.io> wrote:
>> >> > *Update* : This has been actually already solved by Hoss.
>> >> >
>> >> > https://issues.apache.org/jira/browse/SOLR-11711 and this is the Pull
>> >> > Request : https://github.com/apache/lucene-solr/pull/279/files
>> >> >
>> >> > This should go live with 7.3
>> >> >
>> >> > Cheers
>> >> >
>> >> >
>> >> >
>> >> > -----
>> >> > ---------------
>> >> > Alessandro Benedetti
>> >> > Search Consultant, R&D Software Engineer, Director
>> >> > Sease Ltd. - www.sease.io
>> >> > --
>> >> > Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>> >>
>>

Re: facet.method=uif not working in solr cloud?

Posted by Wei <we...@gmail.com>.
Thanks Yonik. If uif has a big upfront cost the first time a request hits
Solr, then in Solr Cloud the same faceting request could hit different
replicas in the same shard, so that cost will be paid at least once per
replica? And if we are doing frequent auto commits, fieldValueCache will be
invalidated and uif will have to pay the upfront cost again after each commit?




Re: facet.method=uif not working in solr cloud?

Posted by Yonik Seeley <ys...@gmail.com>.
On Wed, Feb 14, 2018 at 2:28 PM, Wei <we...@gmail.com> wrote:
> Thanks all!   It's really great learning.  A bit off the topic, after I
> enabled facet.method = uif in solr cloud,  the faceting performance is
> actually much worse than the original fc( ~1000 ms with uif  vs ~200 ms
> with fc). My cloud has 8 shards with 6 replicas in each shard.  I do see
> that fieldValueCache is getting utilized.  Any reason uif could be so
> slow?

I haven't seen that before.  Are you sure it's not the first time
faceting on a field?  uif has a big upfront cost, but is usually faster
once that cost has been paid.


-Yonik


Re: facet.method=uif not working in solr cloud?

Posted by Wei <we...@gmail.com>.
Thanks all! This has been a great learning experience. A bit off topic:
after I enabled facet.method=uif in Solr Cloud, the faceting performance is
actually much worse than with the original fc (~1000 ms with uif vs ~200 ms
with fc). My cloud has 8 shards with 6 replicas in each shard. I do see
that fieldValueCache is getting utilized. Any reason uif could be so
slow?


Re: facet.method=uif not working in solr cloud?

Posted by Yonik Seeley <ys...@gmail.com>.
Great, thanks for tracking that down!
It's interesting that a mincount of 0 disables uif processing in the
first place.  IIRC, it's only the hash-based method (as opposed to
array-based) that can't return zero counts.

-Yonik



Re: facet.method=uif not working in solr cloud?

Posted by Alessandro Benedetti <a....@sease.io>.
*Update*: This has actually already been solved by Hoss.

JIRA: https://issues.apache.org/jira/browse/SOLR-11711
Pull Request: https://github.com/apache/lucene-solr/pull/279/files

This should go live with 7.3.

Cheers




Re: facet.method=uif not working in solr cloud?

Posted by Alessandro Benedetti <a....@sease.io>.
+1

I believe it is a bug related to that patch in some way.
facet.distrib.mco (the naming is not very explicit) should activate the
feature in the patch, which forces the mincount in the distributed requests
to be set to 1.

The normally expected behavior is that the distributed requests receive the
same value for the parameter that you originally set.

Can you open a bug, Wei?
We can investigate the part where the requests are distributed.

Regards




Re: facet.method=uif not working in solr cloud?

Posted by Yonik Seeley <ys...@gmail.com>.
Feels like we should open an issue for this (that facet.method=uif is
only respected if you specify another esoteric parameter...)

-Yonik


On Mon, Feb 12, 2018 at 8:34 PM, Wei <we...@gmail.com> wrote:
> Adding facet.distrib.mco=true did the trick.  Thanks Toke and Alessandro!
>
> Cheers,
> Wei
>
> On Thu, Feb 8, 2018 at 1:23 AM, Toke Eskildsen <to...@kb.dk> wrote:
>
>> On Fri, 2018-02-02 at 17:40 -0800, Wei wrote:
>> > I tried to debug a bit and see that when executing on a cloud solr
>> > server, although I put
>> > facet.field=color&q=*:*&facet.method=uif&facet.mincount=1 in
>> > the request url, at the point it reaches SimpleFacet inside
>> > req.params it somehow has been rewritten
>> > to  f.color.facet.mincount=0, no wonder the
>> > method chosen become FC. So one myth solved; but the new myth is why
>> > the facet.mincount is override to 0 in solr req?
>>
>> AFAIK, it is due to an attempt of optimisation for distributed
>> faceting. The relevant JIRA seems to be
>> https://issues.apache.org/jira/browse/SOLR-8988
>>
>> Try setting facet.distrib.mco=true
>>
>> - Toke Eskildsen, Royal Danish Library
>>
>>

Re: facet.method=uif not working in solr cloud?

Posted by Wei <we...@gmail.com>.
Adding facet.distrib.mco=true did the trick.  Thanks Toke and Alessandro!

Cheers,
Wei


Re: facet.method=uif not working in solr cloud?

Posted by Toke Eskildsen <to...@kb.dk>.
On Fri, 2018-02-02 at 17:40 -0800, Wei wrote:
> I tried to debug a bit and see that when executing on a cloud solr
> server, although I put
> facet.field=color&q=*:*&facet.method=uif&facet.mincount=1 in
> the request url, at the point it reaches SimpleFacet inside
> req.params it somehow has been rewritten
> to  f.color.facet.mincount=0, no wonder the
> method chosen become FC. So one myth solved; but the new myth is why
> the facet.mincount is override to 0 in solr req?

AFAIK, it is due to an attempted optimisation for distributed
faceting. The relevant JIRA seems to be
https://issues.apache.org/jira/browse/SOLR-8988

Try setting facet.distrib.mco=true
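
For reference, a minimal sketch of how the suggested parameter could be appended to the request from the first message. It only builds the query string and sends nothing. The class and method names are hypothetical; the parameters are the ones quoted in this thread plus facet.distrib.mco=true.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Builds the query string only; no request is sent. */
final class UifRequestSketch {

    /** URL-encodes each key and value and joins the pairs with '&'. */
    static String buildQuery(Map<String, String> params) {
        StringJoiner joiner = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            joiner.add(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return joiner.toString();
    }

    /** The parameters from the original post plus the suggested workaround. */
    static Map<String, String> uifParams() {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("q", "*:*");
        p.put("facet", "on");
        p.put("facet.field", "color");
        p.put("facet.method", "uif");
        p.put("facet.mincount", "1");
        p.put("facet.distrib.mco", "true"); // the workaround suggested above
        p.put("debugQuery", "true");
        return p;
    }

    public static void main(String[] args) {
        System.out.println("/select?" + buildQuery(uifParams()));
    }
}
```

With debugQuery=true still set, the appliedMethod field in the facet debug output shows whether the change took effect.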

- Toke Eskildsen, Royal Danish Library


Re: facet.method=uif not working in solr cloud?

Posted by Alessandro Benedetti <a....@sease.io>.
What happens if you set f.color.facet.mincount=1 from the beginning?
Are you faceting on multiple fields?

Good news that we solved the first mystery, though! :)

Cheers




Re: facet.method=uif not working in solr cloud?

Posted by Wei <we...@gmail.com>.
I tried to debug a bit and see that when executing on a cloud Solr server,
although I put facet.field=color&q=*:*&facet.method=uif&facet.mincount=1 in
the request URL, by the time it reaches SimpleFacets the req.params have
somehow been rewritten to f.color.facet.mincount=0, so no wonder the
method chosen becomes FC. So one mystery is solved; but the new mystery is
why facet.mincount is overridden to 0 in the Solr request?
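
The override described above matches Solr's per-field parameter convention, where f.<field>.<param> takes precedence over the global <param>. The sketch below is a hand-rolled stand-in for that lookup, with hypothetical class and method names rather than the actual SolrParams code. It shows why a client-supplied facet.mincount=1 is shadowed once f.color.facet.mincount=0 appears in the shard request.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hand-rolled stand-in for per-field parameter lookup; not Solr's own code. */
final class FieldParamSketch {

    /** Resolves f.<field>.<param> first, then the global <param>, then the default. */
    static int getFieldInt(Map<String, String> params, String field, String param, int def) {
        String value = params.get("f." + field + "." + param);
        if (value == null) {
            value = params.get(param);
        }
        return value == null ? def : Integer.parseInt(value);
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("facet.field", "color");
        params.put("facet.method", "uif");
        params.put("facet.mincount", "1");
        // Only the global value is present: the effective mincount is 1.
        System.out.println(getFieldInt(params, "color", "facet.mincount", 0));

        // A per-field override is added: it wins and the effective mincount is 0.
        params.put("f.color.facet.mincount", "0");
        System.out.println(getFieldInt(params, "color", "facet.mincount", 0));
    }
}
```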

Cheers,
Wei

On Thu, Feb 1, 2018 at 2:01 AM, Alessandro Benedetti <a....@sease.io>
wrote:

> " Looks like when using the json facet api,
> SimpleFacets is not used, replaced by FacetFieldProcessorByArrayUIF "
>
> That is expected, I remember Yonik to stress the fact that it is a
> completely different approach to faceting ( and different components and
> classes are involved).
>
> But your first case, it may be worth an investigation.
> If you have the tools and you are used to it I would encourage you to
> reproduce the issue and remote debug it from a Solr server.
> Putting a breakpoint in the Simple Facets method you should be able to
> solve
> the mystery ( a bug maybe ? I am very curious about it. )
>
> Cheers
>
>
>
>
>

Re: facet.method=uif not working in solr cloud?

Posted by Alessandro Benedetti <a....@sease.io>.
" Looks like when using the json facet api, 
SimpleFacets is not used, replaced by FacetFieldProcessorByArrayUIF "

That is expected. I remember Yonik stressing that it is a completely
different approach to faceting (with different components and classes
involved).

Your first case, though, may be worth investigating.
If you have the tools and are used to them, I would encourage you to
reproduce the issue and remote-debug a Solr server.
By putting a breakpoint in the SimpleFacets method, you should be able to
solve the mystery (a bug, maybe? I am very curious about it).

Cheers





Re: facet.method=uif not working in solr cloud?

Posted by Wei <we...@gmail.com>.
Thanks Alessandro. I totally agree that, from the logic, I can't see why the
requested facet.method=uif is not accepted. I don't see anything in
solr.log either.  However, I find that the uif method somehow works with the
JSON Facet API in cloud mode, e.g.:

curl http://mysolrcloud:8983/solr/mycollection/select -d
'q=*:*&wt=json&rows=0&json.facet={color: {type: terms, field : color,
method : uif, limit:1000, mincount:1}}&debugQuery=true'

Then in the debug response I see:

"facet-trace":{

   "processor":"FacetQueryProcessor",
   "elapse":453,
   "query":null,
   "domainSize":70215,
   "sub-facet":[
      { "processor":"FacetFieldProcessorByArrayUIF", "elapse":1,
        "field":"color", "limit":1000, "numBuckets":20, "domainSize":7166 },
      { "processor":"FacetFieldProcessorByArrayUIF", "elapse":1,
        "field":"color", "limit":1000, "numBuckets":19, "domainSize":7004 },
      { "processor":"FacetFieldProcessorByArrayUIF", "elapse":2,
        "field":"color", "limit":1000, "numBuckets":20, "domainSize":7030 },
      { "processor":"FacetFieldProcessorByArrayUIF", "elapse":80,
        "field":"color", "limit":1000, "numBuckets":20, "domainSize":6969 },
      { "processor":"FacetFieldProcessorByArrayUIF", "elapse":85,
        "field":"color", "limit":1000, "numBuckets":20, "domainSize":6953 },
      { "processor":"FacetFieldProcessorByArrayUIF", "elapse":85,
        "field":"color", "limit":1000, "numBuckets":20, "domainSize":6901 },
      { "processor":"FacetFieldProcessorByArrayUIF", "elapse":93,
        "field":"color", "limit":1000, "numBuckets":20, "domainSize":6951 },
      { "processor":"FacetFieldProcessorByArrayUIF", "elapse":104,
        "field":"color", "limit":1000, "numBuckets":19, "domainSize":7127 }
   ]
}

A few things puzzle me here.  It looks like when using the JSON Facet API,
SimpleFacets is not used and is replaced by the
FacetFieldProcessorByArrayUIF processor. Is that the expected behavior?
Also, with the uif method applied, facet latency is greatly increased.  Some
shards report a much bigger elapse time (104 vs 1); I wonder what could
cause the discrepancy, as my index is evenly distributed across shards.

Thanks,
Wei


On Wed, Jan 31, 2018 at 2:24 AM, Alessandro Benedetti <a....@sease.io>
wrote:

> I worked personally on the SimpleFacets class which does the facet method
> selection :
>
> FacetMethod appliedFacetMethod = selectFacetMethod(field,
>                                 sf, requestedMethod, mincount,
>                                 exists);
>
>     RTimer timer = null;
>     if (fdebug != null) {
>        fdebug.putInfoItem("requestedMethod",
>            requestedMethod == null ? "not specified" : requestedMethod.name());
>        fdebug.putInfoItem("appliedMethod", appliedFacetMethod.name());
>        fdebug.putInfoItem("inputDocSetSize", docs.size());
>        fdebug.putInfoItem("field", field);
>        timer = new RTimer();
>     }
>
> Within the select facet method , the only code block related UIF is (
> another block can apply when facet method arrives null to the Solr Node,
> but that should not apply as we see the facet method in the debug):
>
> /* UIF without DocValues can't deal with mincount=0, the reason is because
>    we create the buckets based on the values present in the result set.
>    So we are not going to see facet values which are not in the result
>    set */
>      if (method == FacetMethod.UIF
>          && !field.hasDocValues() && mincount == 0) {
>        method = field.multiValued() ? FacetMethod.FC : FacetMethod.FCS;
>      }
>
> So is there anything in the logs?
> Because that seems to me the only point where you can change from UIF to FC
> and you clearly have mincount=1.
>
>
>
>
>
>

Re: facet.method=uif not working in solr cloud?

Posted by Alessandro Benedetti <a....@sease.io>.
I personally worked on the SimpleFacets class, which does the facet method
selection:

FacetMethod appliedFacetMethod = selectFacetMethod(field,
                                sf, requestedMethod, mincount,
                                exists);

    RTimer timer = null;
    if (fdebug != null) {
       fdebug.putInfoItem("requestedMethod",
           requestedMethod == null ? "not specified" : requestedMethod.name());
       fdebug.putInfoItem("appliedMethod", appliedFacetMethod.name());
       fdebug.putInfoItem("inputDocSetSize", docs.size());
       fdebug.putInfoItem("field", field);
       timer = new RTimer();
    }

Within selectFacetMethod, the only code block related to UIF is the
following (another block applies when the facet method arrives as null at
the Solr node, but that should not be the case here, since we see the
requested method in the debug output):

/* UIF without DocValues can't deal with mincount=0, the reason is because
   we create the buckets based on the values present in the result set.
   So we are not going to see facet values which are not in the result
   set */
     if (method == FacetMethod.UIF
         && !field.hasDocValues() && mincount == 0) {
       method = field.multiValued() ? FacetMethod.FC : FacetMethod.FCS;
     }

So is there anything in the logs?
Because that seems to me the only point where you can change from UIF to FC
and you clearly have mincount=1.
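
As a rough illustration of that check, here is a simplified, self-contained stand-in. It is not the actual SimpleFacets source: the class name is hypothetical and the boolean flags replace Solr's schema field object. It assumes a multi-valued field without docValues and shows that the fallback fires only when the mincount seen by the node is 0, which is what other messages in this thread report for the shard-level request.

```java
/** Simplified stand-in for the UIF-related branch; not the real SimpleFacets code. */
final class FacetMethodFallbackSketch {

    enum FacetMethod { ENUM, FC, FCS, UIF }

    /** Mirrors only the UIF fallback of the method selection quoted above. */
    static FacetMethod select(FacetMethod requested, boolean hasDocValues,
                              boolean multiValued, int mincount) {
        FacetMethod method = requested;
        // UIF without docValues cannot return zero-count buckets, so fall back.
        if (method == FacetMethod.UIF && !hasDocValues && mincount == 0) {
            method = multiValued ? FacetMethod.FC : FacetMethod.FCS;
        }
        return method;
    }

    public static void main(String[] args) {
        // mincount=1, as sent by the client: UIF is kept.
        System.out.println(select(FacetMethod.UIF, false, true, 1));
        // mincount=0, as seen by the shard: a multi-valued field falls back to FC.
        System.out.println(select(FacetMethod.UIF, false, true, 0));
    }
}
```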




