Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2022/12/21 08:15:18 UTC

[GitHub] [lucene] gf2121 opened a new issue, #12028: Add newSetQuery for IntField, LongField, FloatField, DoubleField

gf2121 opened a new issue, #12028:
URL: https://github.com/apache/lucene/issues/12028

   ### Description
   
   Today `TermInSetQuery` can be rewritten to a disjunction `BooleanQuery` when the term count is < 16, so that the query result is materialized lazily. This can significantly improve query performance in cases like `selective_clause AND low_cardinality_field in (xxx)`.
   
   Recently we added `IntField`, `LongField`, `FloatField` and `DoubleField`, which index both points and doc values (https://github.com/apache/lucene/issues/11199). `xxxField#newExactQuery` can now take advantage of `IndexOrDocValuesQuery` to match with doc values when there is a selective conjunction clause. I wonder if we can have an `xxxField#newSetQuery` that generates a disjunction `BooleanQuery` when the point count is < 16?
    
    For example:
    
    ```
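   // Proposed LongField#newSetQuery; the threshold of 16 follows the TermInSetQuery behaviour described above.
   public static Query newSetQuery(String field, long... values) {
     if (values.length < 16) {
       BooleanQuery.Builder builder = new BooleanQuery.Builder();
       for (long value : values) {
         // SHOULD makes the clauses a disjunction; FILTER would require every value to match.
         builder.add(newExactQuery(field, value), Occur.SHOULD);
       }
       return builder.build();
     }
     // Larger sets fall back to the points-based set query.
     return LongPoint.newSetQuery(field, values);
   }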
    ```
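
   To illustrate why lazy matching helps under a selective conjunction, here is a toy model in plain Java. It is not Lucene code: the class `LazySetMatchSketch`, its methods, and the `visited` counter are hypothetical names made up for this sketch. The eager variant scans every document for simplicity, whereas a real points query walks the BKD tree, but both still collect every matching document up front. The lazy variant only checks the documents that the selective clause already produced.

   ```java
   import java.util.ArrayList;
   import java.util.BitSet;
   import java.util.HashSet;
   import java.util.List;
   import java.util.Set;

   /** Toy model (not Lucene): eager vs lazy evaluation of `lead AND field IN (values)`. */
   class LazySetMatchSketch {

     /** Eager: build the full match set over all documents, then intersect with the lead docs. */
     static List<Integer> eager(long[] fieldValues, int[] leadDocs, Set<Long> wanted, long[] visited) {
       BitSet matches = new BitSet(fieldValues.length);
       for (int doc = 0; doc < fieldValues.length; doc++) {
         visited[0]++; // count every document whose value we inspect
         if (wanted.contains(fieldValues[doc])) {
           matches.set(doc);
         }
       }
       List<Integer> hits = new ArrayList<>();
       for (int doc : leadDocs) {
         if (matches.get(doc)) {
           hits.add(doc);
         }
       }
       return hits;
     }

     /** Lazy: only inspect the value of documents that the selective clause produced. */
     static List<Integer> lazy(long[] fieldValues, int[] leadDocs, Set<Long> wanted, long[] visited) {
       List<Integer> hits = new ArrayList<>();
       for (int doc : leadDocs) {
         visited[0]++;
         if (wanted.contains(fieldValues[doc])) {
           hits.add(doc);
         }
       }
       return hits;
     }

     public static void main(String[] args) {
       int maxDoc = 1_000_000;
       long[] fieldValues = new long[maxDoc];
       for (int doc = 0; doc < maxDoc; doc++) {
         fieldValues[doc] = doc % 8; // a cardinality-8 field
       }
       int[] leadDocs = {1}; // a selective clause that matches a single document
       Set<Long> wanted = new HashSet<>(List.of(1L, 2L, 3L));

       long[] eagerVisited = {0};
       long[] lazyVisited = {0};
       System.out.println("eager hits=" + eager(fieldValues, leadDocs, wanted, eagerVisited)
           + " visited=" + eagerVisited[0]);
       System.out.println("lazy  hits=" + lazy(fieldValues, leadDocs, wanted, lazyVisited)
           + " visited=" + lazyVisited[0]);
     }
   }
   ```

   Both variants return the same hits; they differ only in how many documents they inspect, which is the effect the proposal aims to exploit. The model ignores scoring, skipping and per-segment costs.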


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] gf2121 commented on issue #12028: Add newSetQuery for IntField, LongField, FloatField, DoubleField

Posted by GitBox <gi...@apache.org>.
gf2121 commented on issue #12028:
URL: https://github.com/apache/lucene/issues/12028#issuecomment-1361021593

   I benchmarked some queries like `_id = '1' AND cardinality_8_field in (1, 2, 3)` on 1M docs. Here is the result:
   ```
   Benchmark       Mode  Cnt   Score    Error   Units
   fieldSetQuery  thrpt   10  48.025 ± 16.741  ops/ms
   pointSetQuery  thrpt   10   5.514 ±  0.159  ops/ms
   ```
   `fieldSetQuery` is using `LongField#newSetQuery` (see the example above) while `pointSetQuery` is using `LongPoint#newSetQuery`.




[GitHub] [lucene] gf2121 closed issue #12028: Add newSetQuery for IntField, LongField, FloatField, DoubleField

Posted by "gf2121 (via GitHub)" <gi...@apache.org>.
gf2121 closed issue #12028: Add newSetQuery for IntField, LongField, FloatField, DoubleField
URL: https://github.com/apache/lucene/issues/12028




[GitHub] [lucene] rmuir commented on issue #12028: Add newSetQuery for IntField, LongField, FloatField, DoubleField

Posted by GitBox <gi...@apache.org>.
rmuir commented on issue #12028:
URL: https://github.com/apache/lucene/issues/12028#issuecomment-1382953573

   I don't think it is good to degrade to `BooleanQuery` when using points or doc-values; it will only hurt performance.
   
   Let's add `NumericDocValuesField.newSlowSetQuery()` and `SortedNumericDocValuesField.newSlowSetQuery()` to complement the doc-values based range queries?
   
   The queries in fact already exist, but need to be cleaned up since they have been "hiding" in `lucene/sandbox`. See PR.




[GitHub] [lucene] javanna commented on issue #12028: Add newSetQuery for IntField, LongField, FloatField, DoubleField

Posted by "javanna (via GitHub)" <gi...@apache.org>.
javanna commented on issue #12028:
URL: https://github.com/apache/lucene/issues/12028#issuecomment-1411142472

   Looks like this issue is addressed with the PR above? Can we close it or is there anything left to do that I am missing?




[GitHub] [lucene] gf2121 commented on issue #12028: Add newSetQuery for IntField, LongField, FloatField, DoubleField

Posted by "gf2121 (via GitHub)" <gi...@apache.org>.
gf2121 commented on issue #12028:
URL: https://github.com/apache/lucene/issues/12028#issuecomment-1436240116

   Thanks Robert, nice approach!

