Posted to solr-dev@lucene.apache.org by Yonik Seeley <yo...@apache.org> on 2008/12/11 19:24:57 UTC

support for multi-select facets

There are some categories of facets where it makes sense to allow the
selection of multiple values, and still show the counts for (and the
ability to select) currently unselected values.

Here's a simple example with a multi-select facet "type", and a
traditional facet "author".

---------- type ----------
 x pdf (32)
   word (17)
 x html (46)
   excel (11)

-------- author --------
 erik (31)
 grant (27)
 yonik (14)

Currently, Solr doesn't support this well - all facets generated use
the same base doc set.
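
A minimal sketch of what a shared base doc set means for the counts, using a
hypothetical in-memory list of documents rather than Solr itself (the DOCS
data and the facet_counts helper are assumptions made for illustration):

```python
from collections import Counter

# Hypothetical toy documents; field names follow the example above.
DOCS = [
    {"type": "pdf", "author": "erik"},
    {"type": "html", "author": "grant"},
    {"type": "word", "author": "erik"},
    {"type": "excel", "author": "yonik"},
]


def facet_counts(docs, field):
    """Count the values of `field` over the given doc set."""
    return Counter(d[field] for d in docs)


# A filter on type (pdf OR html) narrows the single base doc set ...
base = [d for d in DOCS if d["type"] in {"pdf", "html"}]

# ... and every facet, including "type" itself, is counted over that set.
type_counts = facet_counts(base, "type")
author_counts = facet_counts(base, "author")
```

Since the "word" and "excel" documents are removed from the base set before
counting, type_counts reports 0 for them even though such documents exist.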

Here is what a request currently looks like:
q=foo&fq=date:[1 TO 2]&fq=securityfilter:42&fq=type:(pdf OR html)&facet.field=author&facet.field=type

The problem of course is that the counts for "word" and "excel" would
come back as "0".  What is needed is to ignore any constraints on
"type" when faceting on that field.

Option #1: ability to specify the query/filters per-facet:
   f.type.facet.base=+foo +date:[1 TO 2] +securityfilter:42
     OR, specify all the parts as field-specific fqs for better caching
   f.type.facet.base=foo&f.type.facet.fq=date:[1 TO 2]&f.type.facet.fq=securityfilter:42
Downsides:
  - field-specific parameters don't work for facet queries, which may
also want this feature.
  - complex filters are repeated and re-parsed.

Option #2: ability to specify as a "local param" (meta-data on a parameter)
  facet.field={!base='f.type.facet.base=+foo +date:[1 TO 2]
+securityfilter:42'}type
Upsides:
  - can work for facet.query params
Downsides:
  - client needs to escape big query string
  - single "base" parameter not good for caching
  - complex filters are repeated and re-parsed.
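
To make the escaping downside concrete, here is a small sketch of what a
client would have to do with the Option #2 value. It only uses Python's
standard urllib.parse; the "base" local param is the proposal from this
message, not an existing Solr feature, and quoting rules inside the local
param itself are ignored here:

```python
from urllib.parse import parse_qs, urlencode

# The proposed (hypothetical) Option #2 value, copied from the example above.
base_value = "f.type.facet.base=+foo +date:[1 TO 2] +securityfilter:42"
facet_field = "{!base='" + base_value + "'}type"

# The whole local-params string must be URL-escaped as one parameter value.
query_string = urlencode({"facet.field": facet_field})

# Round trip: a server-side decode gives back the original value.
decoded = parse_qs(query_string)["facet.field"][0]

# Without escaping, a literal "+" is decoded as a space.
unescaped = parse_qs("q=+foo", keep_blank_values=True)["q"][0]
```

The literal "+" signs have to be sent as %2B, otherwise they arrive as
spaces and the required-clause markers are lost.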

Option #3: tag parts of a request using "local params"
q=foo&fq=date:[1 TO 2]&fq=securityfilter:42&fq={!tag=type}type:(pdf OR
html)&facet.field=type
    &facet.field={!exclude=type}author

So here, one fq is tagged with "type" {!tag=type}
and then excluded when faceting on author.
Upsides:
  - don't necessarily need to repeat and re-parse params since they
are referenced by name/tag.
  - tagging is a generic mechanism that can be used for other functionality.

Thoughts?

-Yonik

Re: support for multi-select facets

Posted by Yonik Seeley <ys...@gmail.com>.
On Tue, Dec 16, 2008 at 2:10 PM, Chris Hostetter
<ho...@fucit.org> wrote:
> Assuming i understand you (and your original example was just a
> transcription mistake)

Correct, it was a mistake.

> then i go back to the basic point of my original
> question about "huffman encoding the common case" ...
> should we make "facet.field={!exclude}X" be shorthand for
> "facet.field={!exclude=X}X" ?

We need the latter for more complex scenarios (what is displayed as
one facet may not be a single Solr facet), or even for faceting on the
same field in different ways.

As for shorthand, {!exclude} is already shorthand for {!type=exclude}
in localParams syntax.
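
A rough sketch of why the bare form is already taken. This is a simplified
stand-in for local-params parsing (no quoting, no escaping, hypothetical
function name), not Solr's actual parser:

```python
def parse_local_params(value):
    """Split '{!k=v ...}rest' into (params, rest).

    Simplified: tokens are whitespace-separated, quoting is not handled,
    and a bare first token is treated as shorthand for type=<token>.
    """
    if not value.startswith("{!"):
        return {}, value
    end = value.index("}")
    params = {}
    for position, token in enumerate(value[2:end].split()):
        key, sep, val = token.partition("=")
        if sep:
            params[key] = val
        elif position == 0:
            params["type"] = key  # bare leading word names the type
    return params, value[end + 1:]


bare = parse_local_params("{!exclude}X")
explicit = parse_local_params("{!exclude=X}X")
```

Under this reading, the bare form yields a "type" entry while the explicit
form yields an "exclude" entry, so the two are not interchangeable.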

> Alternately: it seems likely that people who want this type of
> multi-select behavior will want it for all/most of their facets ...

But the relationship between a single "GUI" facet and the
behind-the-scenes Solr facets may not be one-to-one.

> so
> perhaps instead of the {!exclude=X} local param, we should add a new
> facet.multi=(true|false) param that can be specified on a per-field basis
> ... so f.type.facet.multi=true means facet.field=type ignores any fq's
> where "tag=type".

That also wouldn't work for facet.query.

> This would probably even be possible as syntactic sugar
> in addition to the {!exclude=X} syntax, so in the common case people would
> just use facet.multi=true and forget about it,

It's not that simple though, since they have to correctly tag all the filters.
And setting multi=true isn't really descriptive (as you've noted in
the terminology discussions).  Excluding certain filters when faceting
is very descriptive, and avoids any mention of what the GUI layer is
trying to accomplish.

-Yonik

> but in weird esoteric
> situations where they want filters on field X or field Y to be ignored by
> faceting on field Z they could still use "facet.field={!exclude=X,Y}Z"
>
>
> -Hoss

Re: support for multi-select facets

Posted by Chris Hostetter <ho...@fucit.org>.
: multi-select in a single request requires getting facets that are
: constrained differently per-facet (and may be constrained differently
: than for the top N docs returned.)

I still feel like we're having a huge disconnect in terminology, and that
there are really two orthogonal issues here.

In all of the GUI lingo i've seen, the concept of "multi-select" can be
related to a single GUI element (in the case of Solr: a single field).
when talking about "multi-select faceting" (even for a single field) there
are two radically different use cases:
  case#1: when the user selects a constraint, the main result set is
filtered to only documents matching that constraint and the counts for all
other constraints are reduced accordingly.  when the user selects
additional constraints the main result set and the counts for other
constraints are further refined and limited to the *intersection* of all
selected constraints and the original document set.
  case#2: when the user selects a constraint, the main result set is
filtered to only documents matching that constraint but the counts for all
other constraints remain unaffected.  when the user selects
additional constraints the main result set grows to include the
intersection of the original document set with the *union* of all selected
constraints.
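
The two cases can be sketched on a toy, multivalued "type" field. The
documents and both helper functions are hypothetical and only meant to show
the intersection/union difference, not how Solr evaluates filters:

```python
# Hypothetical documents with a multivalued "type" field.
DOCS = [
    {"id": 1, "type": {"pdf"}},
    {"id": 2, "type": {"html"}},
    {"id": 3, "type": {"pdf", "html"}},
    {"id": 4, "type": {"word"}},
]


def select_intersection(docs, field, selected):
    """case#1: keep docs that match every selected constraint."""
    return [d["id"] for d in docs if all(v in d[field] for v in selected)]


def select_union(docs, field, selected):
    """case#2: keep docs that match at least one selected constraint."""
    return [d["id"] for d in docs if any(v in d[field] for v in selected)]


narrowed = select_intersection(DOCS, "type", ["pdf", "html"])
widened = select_union(DOCS, "type", ["pdf", "html"])
```

Selecting a second value shrinks the result set in case#1 and grows it in
case#2.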

case#1 is currently supported by Solr (for single value fields the use
case is trivial, but Solr also handles the multivalued field use case as
well).  case#2 is not currently possible without making two separate
requests (one with fq's constraining the selected constraints to get the
main results; one w/o those fq's to get the facet counts)
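
A sketch of that two-request workaround, building the two query strings with
Python's urllib.parse. The field names and filters are taken from the example
in this thread; how the client sends the requests and merges the responses is
left out:

```python
from urllib.parse import urlencode

# Parameters shared by both requests.
COMMON = [
    ("q", "foo"),
    ("fq", "date:[1 TO 2]"),
    ("fq", "securityfilter:42"),
]
SELECTED = ("fq", "type:(pdf OR html)")

# Request 1: main results (and the author facet), with the type filter.
results_request = urlencode(
    COMMON + [SELECTED, ("facet", "true"), ("facet.field", "author")]
)

# Request 2: only the type facet counts, without the type filter.
# rows=0 asks for no documents, just the counts.
counts_request = urlencode(
    COMMON + [("rows", "0"), ("facet", "true"), ("facet.field", "type")]
)
```

The cost is a second round trip and a second evaluation of the shared query
and filters, which is what a single-request mechanism would avoid.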

that's what i think of (and what i suspect most people think of) when
discussing multi-select faceting.

: The normal case would be that you only want to remove constraints
: related to what you are faceting on.
: So when faceting on type, disregard any type related filters.  When
: faceting on author, disregard any author related filters.  If I click
: on "word" docs above, I'd want to see all other constraint counts
: change except for those under "type".

I *think* i understand what you just said, it sounds like you're 
describing the same thing i did in case#2 above but you are adding in 
another facet field (author) whose counts should be constrained in the 
same way as the main result set when a "type" constraint is selected.

My confusion is that what you just said doesn't seem to agree with your
suggested syntax...

: Option #3: tag parts of a request using "local params"
: q=foo&fq=date:[1 TO 2]&fq=securityfilter:42&fq={!tag=type}type:(pdf OR
: html)&facet.field=type&facet.field={!exclude=type}author
: 
: So here, one fq is tagged with "type" {!tag=type}
: and then excluded when faceting on author.

...if the goal is that filtering on "type" causes the other facet
counts (ie: "author") to be reduced to reflect the new constraint, but you
don't want the type facet counts to change (so that the other type options
don't vanish) then why is the "author" facet field excluding the tagged
"fq" ... shouldn't your example be...

  q=foo
  fq=date:[1 TO 2]
  fq=securityfilter:42
  fq={!tag=type}type:(pdf OR html)
  facet.field={!exclude=type}type
  facet.field=author

?
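
The intended semantics of that request can be sketched on toy data. This is
an in-memory model under stated assumptions (hypothetical documents, an "acl"
field standing in for the security filter, filters as Python predicates), not
Solr code:

```python
from collections import Counter

# Hypothetical documents.
DOCS = [
    {"type": "pdf", "author": "erik", "acl": 42},
    {"type": "html", "author": "grant", "acl": 42},
    {"type": "word", "author": "erik", "acl": 42},
    {"type": "excel", "author": "yonik", "acl": 42},
    {"type": "word", "author": "grant", "acl": 7},
]

# Each filter is (tag, predicate); tag is None for untagged filters.
FILTERS = [
    (None, lambda d: d["acl"] == 42),
    ("type", lambda d: d["type"] in {"pdf", "html"}),
]


def facet(docs, filters, field, exclude=frozenset()):
    """Count `field` over docs matching every filter whose tag is not
    in `exclude` (a set or tuple of tag names)."""
    active = [pred for tag, pred in filters if tag not in exclude]
    matching = [d for d in docs if all(pred(d) for pred in active)]
    return Counter(d[field] for d in matching)


# facet.field={!exclude=type}type : ignore the tagged type filter.
type_counts = facet(DOCS, FILTERS, "type", exclude={"type"})
# facet.field=author : all filters apply, same as the main result set.
author_counts = facet(DOCS, FILTERS, "author")
```

Here the unselected "word" and "excel" values keep non-zero counts, the
untagged security filter still applies to them, and the author counts
shrink with the type selection.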

Assuming i understand you (and your original example was just a
transcription mistake) then i go back to the basic point of my original
question about "huffman encoding the common case" ...
should we make "facet.field={!exclude}X" be shorthand for
"facet.field={!exclude=X}X" ?

Alternately: it seems likely that people who want this type of
multi-select behavior will want it for all/most of their facets ... so 
perhaps instead of the {!exclude=X} local param, we should add a new 
facet.multi=(true|false) param that can be specified on a per-field basis 
... so f.type.facet.multi=true means facet.field=type ignores any fq's 
where "tag=type".  This would probably even be possible as syntactic sugar 
in addition to the {!exclude=X} syntax, so in the common case people would 
just use facet.multi=true and forget about it, but in weird esoteric 
situations where they want filters on field X or field Y to be ignored by 
faceting on field Z they could still use "facet.field={!exclude=X,Y}Z"


-Hoss


Re: support for multi-select facets

Posted by Yonik Seeley <ys...@gmail.com>.
On Tue, Dec 16, 2008 at 12:49 PM, Chris Hostetter
<ho...@fucit.org> wrote:
>
> : Subject: support for multi-select facets
>
> I'm confused by something ... is the issue here really "multi-select
> facets"? ... that can be dealt with "OR" queries.  What it seems you are
> trying to tackle isn't so much about UIs that want to allow a multi-select
> when faceting on a given field, but when the UI wants to display
> facet counts for one field which are not constrained by existing filters
> on another field.  correct?

multi-select in a single request requires getting facets that are
constrained differently per-facet (and may be constrained differently
than for the top N docs returned.)

> that seems orthogonal to doing "multi-select"
>
> : Option #1: ability to specify the query/filters per-facet:
>        ...
> : Option #2: ability to specify as a "local param" (meta-data on a parameter)
>        ...
> : Option #3: tag parts of a request using "local params"
>
> Wouldn't the simplest solution just be to have a new variant of the "fq"
> param that is utilized by the QueryComponent but ignored by the
> FacetComponent?

That would work for a single facet, but not for multiple facets.
Going back to the original example:

---------- type ----------
 x pdf (32)
   word (17)
 x html (46)
   excel (11)

-------- author --------
 erik (31)
 grant (27)
 yonik (14)


The normal case would be that you only want to remove constraints
related to what you are faceting on.
So when faceting on type, disregard any type related filters.  When
faceting on author, disregard any author related filters.  If I click
on "word" docs above, I'd want to see all other constraint counts
change except for those under "type".

A simpler way to think about this multi-select scenario is at the GUI
level: you want faceting to work exactly as it did before, but you
don't want the multi-select facet to "disappear" when you click on one
of the items.

-Yonik

Re: support for multi-select facets

Posted by Chris Hostetter <ho...@fucit.org>.
: Subject: support for multi-select facets

I'm confused by something ... is the issue here really "multi-select 
facets"? ... that can be dealt with "OR" queries.  What it seems you are 
trying to tackle isn't so much about UIs that want to allow a multi-select 
when faceting on a given field, but when the UI wants to display 
facet counts for one field which are not constrained by existing filters 
on another field.  correct?

that seems orthogonal to doing "multi-select"

: Option #1: ability to specify the query/filters per-facet:
	...
: Option #2: ability to specify as a "local param" (meta-data on a parameter)
	...
: Option #3: tag parts of a request using "local params"

Wouldn't the simplest solution just be to have a new variant of the "fq" 
param that is utilized by the QueryComponent but ignored by the 
FacetComponent?

Assume "rfq" is a "result fq" - affects the main result set, but not faceting...  

  q=foo
  fq=date:[1 TO 2]
  fq=securityfilter:42
  rfq=type:(pdf OR html)
  facet.field=type
  facet.field=author

...facet constraint counts are only bound by the "foo", "date" and 
"security".  main result is also limited by "type"

Your Option #3 seems to be an extension of this idea, (assuming i 
understand it correctly) using "fq={!tag=X}Y" instead of "rfq=Y" -- but 
also requires every facet.field to know about the "X" tag name .... 
wouldn't the common case be that you want all facet.fields to "exclude" 
all facet related fqs? ... should "facet.field={!ex}Y" be shorthand for
"facet.field={!ex=X1,X2,...XN}Y" where X1-XN are the full list of all 
known tags in "fq" params?
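
A sketch of how such a shorthand could be expanded on the client or server
side. Both helpers are hypothetical, and the tag extraction is deliberately
naive: it assumes "tag" is the only local param on an fq and ignores quoting:

```python
import re

# Matches a leading {!tag=...} local param (simplified, see note above).
TAG_RE = re.compile(r"^\{!tag=([^}\s]+)\}")


def known_tags(fq_params):
    """Collect tag names, in order, from a list of fq values."""
    tags = []
    for fq in fq_params:
        match = TAG_RE.match(fq)
        if match:
            tags.extend(match.group(1).split(","))
    return tags


def expand_ex(facet_field, fq_params):
    """Rewrite the proposed '{!ex}Y' into '{!ex=X1,...,XN}Y'."""
    prefix = "{!ex}"
    if not facet_field.startswith(prefix):
        return facet_field
    field = facet_field[len(prefix):]
    tags = known_tags(fq_params)
    if not tags:
        return field  # nothing tagged, so nothing to exclude
    return "{!ex=%s}%s" % (",".join(tags), field)


FQS = [
    "date:[1 TO 2]",
    "{!tag=type}type:(pdf OR html)",
    "{!tag=auth}author:erik",
]
expanded = expand_ex("{!ex}author", FQS)
```

Note that excluding every tagged filter from every facet gives the rfq-style
behavior, which differs from excluding only the filter for the field being
faceted.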


-Hoss


Re: support for multi-select facets

Posted by Yonik Seeley <yo...@apache.org>.
On Thu, Dec 11, 2008 at 2:54 PM, Shalin Shekhar Mangar
<sh...@gmail.com> wrote:
> On Thu, Dec 11, 2008 at 11:54 PM, Yonik Seeley <yo...@apache.org> wrote:
>
>>
>> Option #3: tag parts of a request using "local params"
>> q=foo&fq=date:[1 TO 2]&fq=securityfilter:42&fq={!tag=type}type:(pdf OR
>> html)&facet.field=type
>>    &facet.field={!exclude=type}author
>>
>> So here, one fq is tagged with "type" {!tag=type}
>> and then excluded when faceting on author.
>> Upsides:
>>  - don't necessarily need to repeat and re-parse params since they
>> are referenced by name/tag.
>>  - tagging is a generic mechanism that can be used for other functionality.
>>
>> Thoughts?
>
>
> I like this idea. A few questions:
>
> The tag is only used for the current request, right?

Right.

> How will this look when we want to exclude more than one filter? Will it be
> like fq={!exclude=filter1,filter2} ?

Yeah, exactly what I was thinking.

> Is this local param a syntax we are inventing or is it something which
> already exists?

Already exists, and is utilized in the QParser framework:
http://lucene.apache.org/solr/api/org/apache/solr/search/DisMaxQParserPlugin.html

-Yonik

Re: support for multi-select facets

Posted by Shalin Shekhar Mangar <sh...@gmail.com>.
On Thu, Dec 11, 2008 at 11:54 PM, Yonik Seeley <yo...@apache.org> wrote:

>
> Option #3: tag parts of a request using "local params"
> q=foo&fq=date:[1 TO 2]&fq=securityfilter:42&fq={!tag=type}type:(pdf OR
> html)&facet.field=type
>    &facet.field={!exclude=type}author
>
> So here, one fq is tagged with "type" {!tag=type}
> and then excluded when faceting on author.
> Upsides:
>  - don't necessarily need to repeat and re-parse params since they
> are referenced by name/tag.
>  - tagging is a generic mechanism that can be used for other functionality.
>
> Thoughts?


I like this idea. A few questions:

The tag is only used for the current request, right?

How will this look when we want to exclude more than one filter? Will it be
like fq={!exclude=filter1,filter2} ?

Is this local param a syntax we are inventing or is it something which
already exists?

-- 
Regards,
Shalin Shekhar Mangar.