Posted to issues@lucene.apache.org by "Greg Miller (Jira)" <ji...@apache.org> on 2021/08/18 23:34:00 UTC

[jira] [Created] (LUCENE-10056) Support multiple ToParentBlockJoinQuery instances in a BooleanQuery with Occur == MUST/FILTER

Greg Miller created LUCENE-10056:
------------------------------------

             Summary: Support multiple ToParentBlockJoinQuery instances in a BooleanQuery with Occur == MUST/FILTER
                 Key: LUCENE-10056
                 URL: https://issues.apache.org/jira/browse/LUCENE-10056
             Project: Lucene - Core
          Issue Type: Improvement
          Components: modules/join
            Reporter: Greg Miller


As users of {{ToParentBlockJoinQuery}} (TPBJQ), we've run into some fairly tricky cases where we'd like to include multiple instances in a {{BooleanQuery}} tree that are all required (i.e., MUST/FILTER). The only way to get this to work properly today is to pack all constraints into a single TPBJQ; otherwise, the query may produce incorrect matches.

To illustrate, consider child documents that contain a "price" field (numeric value) and a "condition" field ("new" vs. "used"). Suppose we want to match parents that have a child with "price < 100" and "condition == new". If one TPBJQ is constructed with a {{childQuery}} modeling both constraints, this works just fine and provides parent docs that contain at least one child matching both constraints. But if we create two separate TPBJQ instances, one for "price" and one for "condition", and then join those two instances conjunctively, we get parent documents that contain at least one "condition == new" child and at least one "price < 100" child. Those constraints might be satisfied by different children, in which case the parent is a false match.
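
The difference can be shown with a small model in plain Java. This is only a sketch of the matching semantics described above, not Lucene code: {{Child}}, {{anyChildMatches}}, {{singleJoin}} and {{twoJoins}} are hypothetical names, and a TPBJQ is approximated as "the parent matches if any child in its block matches the child query".

```java
import java.util.List;
import java.util.function.Predicate;

final class BlockJoinSemantics {
    // Hypothetical child document carrying the two fields from the example.
    static final class Child {
        final int price;
        final String condition;

        Child(int price, String condition) {
            this.price = price;
            this.condition = condition;
        }
    }

    static final Predicate<Child> CHEAP = c -> c.price < 100;
    static final Predicate<Child> NEW = c -> c.condition.equals("new");

    // Models one block join: the parent matches if any of its children matches.
    public static boolean anyChildMatches(List<Child> children, Predicate<Child> childQuery) {
        return children.stream().anyMatch(childQuery);
    }

    // One join whose child query carries both constraints.
    public static boolean singleJoin(List<Child> children) {
        return anyChildMatches(children, CHEAP.and(NEW));
    }

    // Two joins combined as MUST clauses: each may be satisfied by a different child.
    public static boolean twoJoins(List<Child> children) {
        return anyChildMatches(children, CHEAP) && anyChildMatches(children, NEW);
    }

    public static void main(String[] args) {
        // No single child is both cheap and new.
        List<Child> parent = List.of(new Child(50, "used"), new Child(150, "new"));
        System.out.println("single join: " + singleJoin(parent));
        System.out.println("two joins:   " + twoJoins(parent));
    }
}
```

For this parent the single join reports no match, while the two separate joins report a match, which is the false positive described above.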

While there's nothing inherently defective with TPBJQ, this feels a bit trappy and easy to get wrong. Even trickier, there are cases where it's not possible to pack all conjunctive child constraints into a single TPBJQ (e.g., using {{DrillSideways}} faceting where the query needs to be iteratively/partially evaluated).

It would be nice to come up with a solution that allows multiple TPBJQ instances to exist in a {{BooleanQuery}}, perhaps by doing some later-stage filtering to verify that the parents are actually proper matches.
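
One possible reading of "later-stage filtering", sketched in plain Java under stated assumptions: each join clause records which children of a candidate parent it matched, and the parent is kept only if at least one child was matched by every clause. The class {{LateStageParentCheck}} and its method are hypothetical; nothing here reflects an existing Lucene API or a decided design.

```java
import java.util.BitSet;
import java.util.List;

final class LateStageParentCheck {
    // Each BitSet belongs to one join clause; bit i is set if child i of the
    // candidate parent's block matched that clause (an assumed representation).
    public static boolean parentIsProperMatch(List<BitSet> childMatchesPerClause) {
        if (childMatchesPerClause.isEmpty()) {
            return false; // arbitrary choice for this sketch: no clauses, no match
        }
        // Intersect the per-clause child sets; clone so the inputs stay untouched.
        BitSet common = (BitSet) childMatchesPerClause.get(0).clone();
        for (BitSet matches : childMatchesPerClause.subList(1, childMatchesPerClause.size())) {
            common.and(matches);
        }
        // A proper match needs at least one child that satisfied every clause.
        return !common.isEmpty();
    }

    public static void main(String[] args) {
        BitSet priceClause = new BitSet();
        priceClause.set(0); // child 0 has price < 100
        BitSet conditionClause = new BitSet();
        conditionClause.set(1); // child 1 has condition == new
        System.out.println(parentIsProperMatch(List.of(priceClause, conditionClause)));
    }
}
```

In this example the two clauses were satisfied by different children, so the check rejects the parent.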



