Posted to solr-commits@lucene.apache.org by Apache Wiki <wi...@apache.org> on 2008/12/02 19:46:33 UTC

[Solr Wiki] Update of "SimpleFacetParameters" by YonikSeeley

The following page has been changed by YonikSeeley:
http://wiki.apache.org/solr/SimpleFacetParameters

The comment on the change is:
document new facet.method param

------------------------------------------------------------------------------
  This parameter can be specified on a per field basis.
  
  <!> ["Solr1.2"]
+ 
+ == facet.method ==
+ 
+ This parameter indicates what type of algorithm/method to use when faceting a field.
+ 
+ The {{{enum}}} method was the default (and only) method prior to Solr1.4.  It enumerates all terms in a field, calculating the set intersection of documents that match the term with documents that match the query.
+ 
+ The {{{fc}}} method is the default in Solr1.4 since it tends to use less memory and is faster when a multi-valued field has many unique terms in the index.  The field is un-inverted and cached, much like the Lucene !FieldCache used to sort query results.  The facet counts are calculated by iterating over documents that match the query and summing the terms that appear in each document.
+ 
+ The default value is {{{fc}}}
+ 
+ <!> ["Solr1.4"]
  
  == Date Faceting Parameters ==
  

Re: [Solr Wiki] Update of "SimpleFacetParameters" by YonikSeeley

Posted by Chris Hostetter <ho...@fucit.org>.

: I've updated the wording - hopefully it's a little clearer now.

Yeah, but SolrFacetingOverview should probably be updated as well.

: counting up terms.  This did exist for single valued fields, and now
: also exists for multi-valued fields.  The implementation is different
: of course, but I don't think we need more control over that.  If
: anything the single-valued implementation could be made a little more
: efficient w.r.t memory use.

Hmmm... I think I understand: facet.method determines whether enum is
used or an un-inverted field cache of some form (either the Lucene
FieldCache or the new Solr UnInvertedField) is used -- and the specifics
are determined by whether the field is multi-tokened?

If we do "optimize" the single-value case to use UnInvertedField instead,
we should definitely add an option for that, since some people may also be
sorting on the field (and having an UnInvertedField in addition to a
FieldCache could be considered a waste of RAM).

-Hoss


Re: [Solr Wiki] Update of "SimpleFacetParameters" by YonikSeeley

Posted by Yonik Seeley <yo...@apache.org>.
On Tue, Dec 2, 2008 at 2:32 PM, Chris Hostetter
<ho...@fucit.org> wrote:
>
> : + This parameter indicates what type of algorithm/method to use when faceting a field.
> : +
> : + The {{{enum}}} method was the default (and only) method prior to Solr1.4.  It enumerates all terms in a field, calculating the set intersection of documents that match the term with documents that match the query.
>
> ...that's not strictly true though, there was a FieldCache based approach
> that was used for non-boolean single token fields ... has that now been
> completely eliminated? ... my naive reading of the match is that it's
> still there, and still triggered in some cases -- should we make it an
> explicit option for this new facet.method param?

I've updated the wording - hopefully it's a little clearer now.

facet.method=enum means iterate over terms and calculate set intersections.
facet.method=fc means iterate over documents matching the query,
counting up terms.  This did exist for single valued fields, and now
also exists for multi-valued fields.  The implementation is different
of course, but I don't think we need more control over that.  If
anything the single-valued implementation could be made a little more
efficient w.r.t memory use.
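
To make the contrast concrete, here is a toy in-memory sketch of the two
counting strategies. It is only an illustration under simplifying
assumptions, not Solr's actual code: the DOCS data and the function names
facet_enum and facet_fc are hypothetical, and plain Python dicts and sets
stand in for the index, the filter sets, and the un-inverted field.

```python
from collections import Counter

# Hypothetical toy index: doc id -> terms in the faceted field.
DOCS = {
    0: ["red", "blue"],
    1: ["red"],
    2: ["green", "blue"],
    3: ["red", "green"],
}


def facet_enum(docs, matching):
    """enum-style: iterate over terms and intersect doc sets."""
    # Inverted view of the field: term -> set of doc ids containing it.
    postings = {}
    for doc_id, terms in docs.items():
        for term in terms:
            postings.setdefault(term, set()).add(doc_id)

    counts = {}
    for term, doc_ids in postings.items():
        # Count = size of (docs with this term) AND (docs matching the query).
        n = len(doc_ids & matching)
        if n:
            counts[term] = n
    return counts


def facet_fc(docs, matching):
    """fc-style: iterate over matching docs and count up their terms."""
    counts = Counter()
    for doc_id in matching:
        # Un-inverted view: look up the terms for this document directly.
        # set() guards against a term being listed twice in one document.
        counts.update(set(docs[doc_id]))
    return dict(counts)


# Both strategies produce the same counts; they differ in what they iterate.
matching_docs = {0, 1, 3}
assert facet_enum(DOCS, matching_docs) == facet_fc(DOCS, matching_docs)
```

In this sketch the enum-style work grows with the number of unique terms,
while the fc-style work grows with the number of matching documents, which
is the trade-off at issue when a field has many unique terms.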

-Yonik

Re: [Solr Wiki] Update of "SimpleFacetParameters" by YonikSeeley

Posted by Chris Hostetter <ho...@fucit.org>.
: + This parameter indicates what type of algorithm/method to use when faceting a field.
: + 
: + The {{{enum}}} method was the default (and only) method prior to Solr1.4.  It enumerates all terms in a field, calculating the set intersection of documents that match the term with documents that match the query.

...that's not strictly true though, there was a FieldCache based approach
that was used for non-boolean single token fields ... has that now been
completely eliminated? ... my naive reading of the match is that it's
still there, and still triggered in some cases -- should we make it an
explicit option for this new facet.method param?


-Hoss