Posted to issues@lucene.apache.org by "Greg Miller (Jira)" <ji...@apache.org> on 2021/05/03 21:21:00 UTC

[jira] [Updated] (LUCENE-9945) Extend DrillSideways to support exposing FacetCollectors directly

     [ https://issues.apache.org/jira/browse/LUCENE-9945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Greg Miller updated LUCENE-9945:
--------------------------------
    Description: 
The {{DrillSideways}} logic currently encapsulates, 1) the creation of multiple {{FacetsCollector}} instances, and 2) the processing of those {{FacetsCollector}}s into a single {{Facets}} instance. While I suspect this works well for most common cases, and is simple to understand, it's difficult to extend to more advanced cases.

I propose extending {{DrillSideways}} to support exposing the underlying {{FacetsCollector}} instances if the user needs them, in addition to maintaining the current functionality for all of the more common cases. Specifically, I'd like to add both the "drill down" {{FacetsCollector}} and map of dim -> {{FacetsCollector}} for "drill sideways" to the {{DrillSidewaysResult}} and {{ConcurrentDrillSidewaysResult}} classes. While it's true that a user can extend {{DrillSideways}} and override {{buildFacetsResult}} to keep track of these, it seems reasonable to provide this in {{DrillSideways}} itself so users don't need to sub-class for only this purpose.

Here are two use-cases illustrating the desire for this:
 # For certain cases, instead of actually creating a {{Facets}} instance from the {{FacetsCollector}}, I'd like to intersect the {{FacetsCollector}} matching docs with a {{TermEnum}} to determine whether or not at least one match exists. This is useful for a "facet" that can only take on the values "true"/"false", and I want to make sure at least one hit has the value "true".
 # In another case, I only care about some aggregate statistics for a given facet field. For example, I want to find the min and max values. For this, I want to intersect some doc value field with a {{FacetsCollector}} and only track the min/max values I observe while iterating.
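
As a rough illustration of the two use-cases above, here is a plain-Java sketch that models the idea without any Lucene dependency. The matching docs of one {{FacetsCollector}} segment are stood in for by a {{java.util.BitSet}}, the docs containing the term "true" by another {{BitSet}}, and a numeric doc value field by a {{long[]}} plus a {{BitSet}} of docs that actually have a value. The class and method names ({{CollectorIntersectionSketch}}, {{anyMatch}}, {{minMax}}) are hypothetical and only for illustration; real code would iterate the collector's matching docs per segment instead.

```java
import java.util.BitSet;

public class CollectorIntersectionSketch {

  /** Use-case 1: true if at least one matching doc also carries the "true" term. */
  public static boolean anyMatch(BitSet matchingDocs, BitSet docsWithTrue) {
    return matchingDocs.intersects(docsWithTrue);
  }

  /**
   * Use-case 2: min and max of a numeric field over the matching docs only.
   * Returns {min, max}, or null if no matching doc has a value.
   */
  public static long[] minMax(BitSet matchingDocs, BitSet docsWithValue, long[] values) {
    long min = Long.MAX_VALUE;
    long max = Long.MIN_VALUE;
    boolean seen = false;
    // Walk the matching docs in order and skip those without a value.
    for (int doc = matchingDocs.nextSetBit(0); doc >= 0; doc = matchingDocs.nextSetBit(doc + 1)) {
      if (!docsWithValue.get(doc)) {
        continue;
      }
      seen = true;
      min = Math.min(min, values[doc]);
      max = Math.max(max, values[doc]);
    }
    return seen ? new long[] {min, max} : null;
  }

  public static void main(String[] args) {
    BitSet matching = new BitSet();   // docs 1, 3 and 4 matched the query
    matching.set(1);
    matching.set(3);
    matching.set(4);

    BitSet withTrue = new BitSet();   // only doc 3 has the value "true"
    withTrue.set(3);

    BitSet withValue = new BitSet();  // docs 0, 1 and 4 have a numeric value
    withValue.set(0);
    withValue.set(1);
    withValue.set(4);
    long[] values = {100, 7, 0, 0, 42};

    System.out.println(anyMatch(matching, withTrue)); // prints "true"
    long[] mm = minMax(matching, withValue, values);
    System.out.println(mm[0] + " " + mm[1]);          // prints "7 42"
  }
}
```

Neither method builds per-value counts, which is the point of both use-cases: once the collectors are exposed, only the aggregate that is actually needed has to be computed.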

  was:
The {{DrillSideways}} logic currently encapsulates, 1) the creation of multiple {{FacetsCollector}} instances, and 2) the processing of those {{FacetsCollector}}s into a single {{Facets}} instance. While I suspect this works well for most common cases, and is simple to understand, it's difficult to extend to more advanced cases.

I propose extending {{DrillSideways}} to support exposing the underlying {{FacetsCollector}} instances if the user needs them, in addition to maintaining the current functionality for all of the more common cases. Specifically, I'd like to add both the "drill down" {{FacetsCollector}} and map of dim -> {{FacetsCollector}} for "drill sideways" to the {{DrillSidewaysResult}} and {{ConcurrentDrillSidewaysResult}} classes. While it's true that a user can extend {{DrillSideways}} and override {{buildFacetsResult}} to keep track of these, it seems reasonable to provide this in {{DrillSideways}} itself so users don't need to sub-class for only this purpose.

Here are two use-cases illustrating the desire for this:
# For certain cases, instead of actually creating a {{Facets}} instance from the {{FacetsCollector}}, I'd like to intersect the {{FacetsCollector}} matching docs with a {{TermEnum}} to determine whether or not at least one match exists. This is useful for a "facet" that can only take on the values "true"/"false", and I want to make sure at least one hit has the value "true".
# In another case, I only care about some aggregate statistics for a given facet field. For example, I want to find the min and max values. For this, I want to intersect some doc value field with a {{FacetsCollector}} and only track the min/max values I observe while iterating.


> Extend DrillSideways to support exposing FacetCollectors directly
> -----------------------------------------------------------------
>
>                 Key: LUCENE-9945
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9945
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/facet
>    Affects Versions: main (9.0)
>            Reporter: Greg Miller
>            Priority: Minor
>
> The {{DrillSideways}} logic currently encapsulates, 1) the creation of multiple {{FacetsCollector}} instances, and 2) the processing of those {{FacetsCollector}}s into a single {{Facets}} instance. While I suspect this works well for most common cases, and is simple to understand, it's difficult to extend to more advanced cases.
> I propose extending {{DrillSideways}} to support exposing the underlying {{FacetsCollector}} instances if the user needs them, in addition to maintaining the current functionality for all of the more common cases. Specifically, I'd like to add both the "drill down" {{FacetsCollector}} and map of dim -> {{FacetsCollector}} for "drill sideways" to the {{DrillSidewaysResult}} and {{ConcurrentDrillSidewaysResult}} classes. While it's true that a user can extend {{DrillSideways}} and override {{buildFacetsResult}} to keep track of these, it seems reasonable to provide this in {{DrillSideways}} itself so users don't need to sub-class for only this purpose.
> Here are two use-cases illustrating the desire for this:
>  # For certain cases, instead of actually creating a {{Facets}} instance from the {{FacetsCollector}}, I'd like to intersect the {{FacetsCollector}} matching docs with a {{TermEnum}} to determine whether or not at least one match exists. This is useful for a "facet" that can only take on the values "true"/"false", and I want to make sure at least one hit has the value "true".
>  # In another case, I only care about some aggregate statistics for a given facet field. For example, I want to find the min and max values. For this, I want to intersect some doc value field with a {{FacetsCollector}} and only track the min/max values I observe while iterating.



