Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2022/06/03 07:55:57 UTC

[GitHub] [lucene] gsmiller commented on pull request #841: LUCENE-10274: Add hyperrectangle faceting capabilities

gsmiller commented on PR #841:
URL: https://github.com/apache/lucene/pull/841#issuecomment-1145696479

   Trying to catch up on this now. I've been traveling and it's been difficult to find time. Thanks for all your thoughts @shaie!
   
   I think I'm only half-following your thoughts on the different APIs necessary, and will probably need to look at what you've documented in more detail. But... as a half-baked response, I'm not convinced (yet?) that we need this level of complexity in the API. In my mind, what we're trying to build is a generalization of what is already supported in long/double-range faceting (e.g., `LongRangeFacetCounts`), where the user specifies all the ranges they want counts for, we count hits against those ranges, and support returning those counts through a couple of APIs. Those faceting implementations allow ranges to be specified in a single dimension, and determine which ranges the document points (in one-dimensional space) fall in.
   
   So "hyperrectangle faceting" (in my original thinking at least) is just a generalization of this to multiple dimensions. The points associated with the documents are in n-dimensional space, and the user specifies the different "hyperrectangles" they want counts for by providing a [min, max] range in each dimension. For cases like the "automotive parts finder" example, it's perfectly valid for the "hyperrectangles" provided by the user to also be single points (where the min/max are equivalent values in each dimension). But it's also valid to mix-and-match, where some dimensions are single points and some are ranges (e.g., "all auto parts that fit 'Chevy' (single point) for the years 2000-2010 (range)").
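   
   To make the idea above concrete, here is a rough sketch in plain Java. It is not the API in this PR and not existing Lucene code: `HyperRectangleSketch`, `HyperRectangle`, `contains`, and `count` are hypothetical names, each document is assumed to carry exactly one point, and the counting is a brute-force loop purely for illustration.

```java
/** Hypothetical sketch (not Lucene API): counts n-dimensional points against hyperrectangles. */
final class HyperRectangleSketch {

  /** One hyperrectangle: an inclusive [min, max] range for each dimension. */
  static final class HyperRectangle {
    final String label;
    final long[] min;
    final long[] max;

    HyperRectangle(String label, long[] min, long[] max) {
      if (min.length != max.length) {
        throw new IllegalArgumentException("min and max must have the same number of dimensions");
      }
      this.label = label;
      this.min = min;
      this.max = max;
    }

    /** True if the point lies inside the range on every dimension (bounds inclusive). */
    boolean contains(long[] point) {
      if (point.length != min.length) {
        throw new IllegalArgumentException("point has the wrong number of dimensions");
      }
      for (int d = 0; d < min.length; d++) {
        if (point[d] < min[d] || point[d] > max[d]) {
          return false;
        }
      }
      return true;
    }
  }

  /** Brute-force count: counts[i] is the number of points falling in rects[i]. */
  static int[] count(long[][] docPoints, HyperRectangle[] rects) {
    int[] counts = new int[rects.length];
    for (long[] point : docPoints) {
      for (int i = 0; i < rects.length; i++) {
        if (rects[i].contains(point)) {
          counts[i]++;
        }
      }
    }
    return counts;
  }
}
```

   With a made-up encoding where dimension 0 is a make id (say `1` for 'Chevy') and dimension 1 is the year, the mixed case is `new HyperRectangle("Chevy 2000-2010", new long[] {1, 2000}, new long[] {1, 2010})`: a single point on the first dimension and a range on the second.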
   
   In the situation where a user wants to "fix some dimension" and count over others, it can still be described as a set of "hyperrectangles," but where the specified ranges on some of the dimensions happen to be the same across all of them.
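   
   As a rough illustration of that last point, the sketch below builds such a set in plain Java. Everything in it is hypothetical (`FixedDimensionSketch`, `withFixedDimension`, `contains`, and the `rect[d][0]`/`rect[d][1]` min/max layout are invented for this example and are not part of the PR): one dimension gets the same range in every hyperrectangle, a second dimension varies, and any remaining dimensions are left unbounded.

```java
/** Hypothetical sketch (not Lucene API): "fix one dimension, count over another". */
final class FixedDimensionSketch {

  /**
   * Builds one hyperrectangle per entry of variedRanges. Each result is laid out as
   * rect[d][0] = min and rect[d][1] = max (inclusive) for dimension d. The fixed
   * dimension gets the same [fixedMin, fixedMax] in every hyperrectangle.
   */
  static long[][][] withFixedDimension(
      int dims, int fixedDim, long fixedMin, long fixedMax, int variedDim, long[][] variedRanges) {
    long[][][] rects = new long[variedRanges.length][dims][2];
    for (int i = 0; i < variedRanges.length; i++) {
      for (int d = 0; d < dims; d++) {
        // Dimensions that are neither fixed nor varied stay unconstrained.
        rects[i][d][0] = Long.MIN_VALUE;
        rects[i][d][1] = Long.MAX_VALUE;
      }
      rects[i][fixedDim][0] = fixedMin;
      rects[i][fixedDim][1] = fixedMax;
      rects[i][variedDim][0] = variedRanges[i][0];
      rects[i][variedDim][1] = variedRanges[i][1];
    }
    return rects;
  }

  /** True if the point lies inside the hyperrectangle on every dimension. */
  static boolean contains(long[][] rect, long[] point) {
    for (int d = 0; d < rect.length; d++) {
      if (point[d] < rect[d][0] || point[d] > rect[d][1]) {
        return false;
      }
    }
    return true;
  }
}
```

   For example, `withFixedDimension(2, 0, 1, 1, 1, new long[][] {{2000, 2004}, {2005, 2010}})` yields two hyperrectangles that both pin dimension 0 to the single value `1` and differ only in their year range.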
   
   So I'm not quite sure if what you're suggesting in the API is just syntactic sugar on top of this idea, or if we're possibly talking about different things here? I'll try to dive into your suggestion more though and understand. I feel like I'm just missing something important and need to catch up on your thinking. Thanks again for sharing! I'll circle back in a few days when I've (hopefully) had some more time to spend on this :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

