Posted to solr-user@lucene.apache.org by Jason Gerlowski <ge...@gmail.com> on 2020/11/17 13:21:19 UTC

Faceting: !terms vs mincount precedence

Hey all,

I was using the {!terms} local parameter on some traditional field
facets to make sure particular values were returned.

e.g. facet=true&facet.field={!terms='fantasy,scifi,mystery'}genre_s&f.genre_s.facet.mincount=2

On single-shard collections in Solr 8.6.3 this worked as I expected -
"fantasy", "scifi", and "mystery" were the only 3 field values
returned, and "mystery" was returned despite its count value being
less than the specified "mincount".  But on a multi-shard collection
"mystery" isn't returned (presumably because a "mincount" check
filters out the values on the facet aggregator node).
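
A minimal sketch of how the request above might be built from a script, so that the braces, quotes, and commas in the local params are URL-encoded correctly. The host and collection name (`localhost:8983`, `books`) and the `q=*:*` and `rows=0` parameters are assumptions added for illustration; only the three facet parameters come from the example. The helper name `build_facet_url` is hypothetical, and it does no escaping of terms that themselves contain commas or quotes.

```python
from urllib.parse import urlencode

# Hypothetical Solr location; substitute your own host and collection.
BASE_URL = "http://localhost:8983/solr/books/select"


def build_facet_url(field, terms, mincount, base_url=BASE_URL):
    """Return a select URL that facets on `field`, restricted to `terms`."""
    # "terms" local param, written the same way as in the example request.
    local_params = "{!terms='%s'}" % ",".join(terms)
    params = [
        ("q", "*:*"),    # assumption: match all documents
        ("rows", "0"),   # assumption: only the facet counts are wanted
        ("facet", "true"),
        ("facet.field", local_params + field),
        ("f.%s.facet.mincount" % field, str(mincount)),
    ]
    return base_url + "?" + urlencode(params)


url = build_facet_url("genre_s", ["fantasy", "scifi", "mystery"], 2)
print(url)
```

Sending the same URL to a single-shard and a multi-shard collection is one way to reproduce the difference described above.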

What are the expected semantics when "{!terms}" and "mincount" are
used together?  Should mincount filter out values in {!terms}, or
should those values be excluded from any mincount filtering?  The
behavior is clearly inconsistent between single and multi-shard, so it
deserves a JIRA either way.  Just trying to figure out what the
expected behavior is.
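
To make the two candidate semantics concrete, here is a small sketch in plain Python. It is not Solr code: `apply_mincount` and its `exempt_requested` flag are hypothetical names, and the counts are made up. It only models the question of whether values named in the "terms" local param should bypass the mincount check.

```python
def apply_mincount(counts, requested, mincount, exempt_requested):
    """Reduce facet counts to the requested terms under one of two semantics.

    counts: dict of term -> count
    requested: terms listed in the "terms" local param
    exempt_requested: if True, requested terms are kept regardless of count;
                      if False, mincount is applied to them as well
    """
    result = {}
    for term in requested:
        count = counts.get(term, 0)
        if exempt_requested or count >= mincount:
            result[term] = count
    return result


# Made-up counts: "mystery" falls below mincount=2.
counts = {"fantasy": 5, "scifi": 3, "mystery": 1, "romance": 9}
requested = ["fantasy", "scifi", "mystery"]

exempt = apply_mincount(counts, requested, 2, exempt_requested=True)
filtered = apply_mincount(counts, requested, 2, exempt_requested=False)

print(exempt)    # keeps "mystery", like the single-shard result described above
print(filtered)  # drops "mystery", like the multi-shard result described above
```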

Best,

Jason

Re: Faceting: !terms vs mincount precedence

Posted by Jason Gerlowski <ge...@gmail.com>.
Thanks for the context David - I didn't realize this was built as an
internal mechanism and then documented later on.  A few other thoughts
below:

> {!terms}, it suggests a reference to the TermsQParser, but when you write {!terms=a,b,c} it suggests local-params
I agree that the two are easy to confuse.  Apologies for abbreviating
it at points in my earlier email - I was doing it for brevity and
didn't intend the confusion.

> I think that "terms" local-param to faceting was a purely internal thing that wasn't documented
That may be.  But I disagree that it shouldn't have been documented in
the first place.  Digging into this has cost me a good bit of time,
and even now maybe I've got more digging to do, maybe a bug to fix,
etc.  But without someone's (Christine's?) documentation I'd be even
worse off, without any idea that this "terms" local-params support
exists at all.  The documentation even mentions that "terms" doesn't
work well with some other faceting params.  The details could be a bit
fuller, but the warning *is* there.  So I don't find any fault with
documenting this sort of stuff - especially when it gives warnings
about potential limitations.

Anyway, still hoping someone else might chime in with a slick
workaround or something.  But it does look at this point like I'll
have to go another route or put in some effort myself.

Jason

On Tue, Nov 17, 2020 at 3:41 PM David Smiley <ds...@apache.org> wrote:
>
> This is confusing because when you write {!terms}, it suggests a reference
> to the TermsQParser, but when you write {!terms=a,b,c} it suggests
> local-params, with key "terms" and value "a,b,c" -- entirely different
> things.  I think that "terms" local-param to faceting was a purely internal
> thing that wasn't documented; it existed as an internal implementation
> detail.  Then someone (I think Christine, if not then Mikhail) observed it
> wasn't documented, and added some basic docs.  Now you come along and try
> to use it with other things that unsurprisingly it just wasn't designed
> for.  That's my estimation of the matter... and *if* true, illustrates that
> maybe some internal params should stay internal and don't need to be
> publicly documented.  I confess I've used that faceting local-param in an
> app once before too; it's useful.  I know my response isn't a direct answer
> to your question RE mincount... perhaps it can be made to work?
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley

Re: Faceting: !terms vs mincount precedence

Posted by David Smiley <ds...@apache.org>.
This is confusing because when you write {!terms}, it suggests a reference
to the TermsQParser, but when you write {!terms=a,b,c} it suggests
local-params, with key "terms" and value "a,b,c" -- entirely different
things.  I think that "terms" local-param to faceting was a purely internal
thing that wasn't documented; it existed as an internal implementation
detail.  Then someone (I think Christine, if not then Mikhail) observed it
wasn't documented, and added some basic docs.  Now you come along and try
to use it with other things that unsurprisingly it just wasn't designed
for.  That's my estimation of the matter... and *if* true, illustrates that
maybe some internal params should stay internal and don't need to be
publicly documented.  I confess I've used that faceting local-param in an
app once before too; it's useful.  I know my response isn't a direct answer
to your question RE mincount... perhaps it can be made to work?
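
To illustrate the distinction, the sketch below puts the two spellings side by side as request parameters. The `{!terms f=...}` form is the Terms Query Parser syntax as given in the Solr Reference Guide; the field name `genre_s` and the values are borrowed from the original question, and the variable names are only for illustration.

```python
from urllib.parse import urlencode

# 1) TermsQParser: "terms" names the query parser, "f" is a local param
#    naming the field, and the comma-separated values are the query body.
terms_query_parser = {"fq": "{!terms f=genre_s}fantasy,scifi,mystery"}

# 2) Facet local-param: "terms" is a key whose value is the term list,
#    and the facet field name is the body.
facet_local_param = {
    "facet": "true",
    "facet.field": "{!terms=fantasy,scifi,mystery}genre_s",
}

print(urlencode(terms_query_parser))
print(urlencode(facet_local_param))
```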

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley

