Posted to dev@lucene.apache.org by "Michael McCandless (JIRA)" <ji...@apache.org> on 2013/11/07 17:41:21 UTC

[jira] [Created] (LUCENE-5333) Support sparse faceting for heterogeneous indices

Michael McCandless created LUCENE-5333:
------------------------------------------

             Summary: Support sparse faceting for heterogeneous indices
                 Key: LUCENE-5333
                 URL: https://issues.apache.org/jira/browse/LUCENE-5333
             Project: Lucene - Core
          Issue Type: New Feature
          Components: modules/facet
            Reporter: Michael McCandless


In some search apps, e.g. a large e-commerce site, the index can have
a mix of wildly different product categories and facet dimensions, and
the number of dimensions could be huge.

E.g. maybe the index has shirts, computer memory, hard drives, etc.,
and each of these many categories has different attributes.

In such an index, when someone searches for "so dimm", which should
match a bunch of laptop memory modules, you can't (easily) know up
front which facet dimensions will be important.

But I think this is very easy for the facet module: since ords are
stored "row stride" (each doc lists all facet labels it has), we could
simply count all facets that the hits actually saw, and then at the
end see which dimensions "got traction" and return facet results for
those top dims.
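
As a rough illustration of the counting idea above, here is a minimal
sketch in plain Java. It does not use the facet module's API: the
class SparseFacetSketch, the methods countOrds and topDims, and the
int[][] / String[] stand-ins for the per-document ords and the
ord-to-dimension mapping are all hypothetical names invented for this
example. It assumes every hit's ords are available in memory and ranks
dimensions by the total number of label occurrences among the hits.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the Lucene facet module API.
final class SparseFacetSketch {

    // Counts every facet ord seen on the hit documents.
    // docOrds[doc] lists all ords of that doc ("row stride").
    static int[] countOrds(int[][] docOrds, int[] hits, int numOrds) {
        int[] counts = new int[numOrds];
        for (int doc : hits) {
            for (int ord : docOrds[doc]) {
                counts[ord]++;
            }
        }
        return counts;
    }

    // Sums the counts per dimension and returns up to topN dimension
    // names, highest total first (ties broken by name). Dimensions
    // that no hit touched are never returned.
    static List<String> topDims(int[] counts, String[] ordToDim, int topN) {
        Map<String, Integer> dimTotals = new HashMap<>();
        for (int ord = 0; ord < counts.length; ord++) {
            if (counts[ord] > 0) {
                dimTotals.merge(ordToDim[ord], counts[ord], Integer::sum);
            }
        }
        List<Map.Entry<String, Integer>> entries =
            new ArrayList<>(dimTotals.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(topN, entries.size()); i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }

    public static void main(String[] args) {
        // Made-up data: ords 0-1 are memory types, 2-3 capacities,
        // 4 a sleeve length, 5 a color.
        String[] ordToDim = {
            "Memory Type", "Memory Type", "Capacity", "Capacity",
            "Sleeve Length", "Color"
        };
        int[][] docOrds = { {0, 2, 5}, {0, 3}, {4, 5}, {1, 2} };
        int[] hits = {0, 1}; // pretend these docs matched the query

        int[] counts = countOrds(docOrds, hits, ordToDim.length);
        System.out.println(topDims(counts, ordToDim, 2));
    }
}
```

The caller never names a dimension up front: "Sleeve Length" drops out
simply because no hit carried one of its labels. A real implementation
would also need to decide how to score a dimension (label occurrences,
as here, or distinct documents), which this sketch leaves open.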

I'm not sure what the API would look like, but conceptually this
should work very well, because of how the facet module works.
You shouldn't have to state up front exactly which facet dimensions
to count...



