Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/09/15 13:15:00 UTC

[GitHub] [pinot] priyen opened a new issue, #9402: [Bug] Full scan is happening on all docs instead of just those returned by the inverted index

priyen opened a new issue, #9402:
URL: https://github.com/apache/pinot/issues/9402

   We have this query:
   ```
   SELECT
     "customer_id"
   FROM
     "table name"
   WHERE
     "_viewing_merchant" = 'some merchant id'
     AND (
       (
         "customer_id" < 'some customer id'
         AND "last_payment" = 1661489443000.0
       )
       OR "last_payment" < 1661489443000.0
     )
   ```
   and this is the EXPLAIN output for it:
   ```
   BROKER_REDUCE(sort:[last_payment DESC, customer_id DESC],limit:21)	0	-1
   COMBINE_SELECT_ORDERBY	1	0
   SELECT_ORDERBY(selectList:last_payment, customer_id)	2	1
   TRANSFORM_PASSTHROUGH(customer_id, last_payment)	3	2
   PROJECT(last_payment, customer_id)	4	3
   FILTER_AND	5	4
   FILTER_INVERTED_INDEX(indexLookUp:inverted_index,operator:EQ,predicate:_viewing_merchant = 'some merchant id')	6	5
   FILTER_OR	7	5
   FILTER_AND	8	7
   FILTER_FULL_SCAN(operator:RANGE,predicate:customer_id < 'xxxxxxx')	9	8
   FILTER_FULL_SCAN(operator:EQ,predicate:last_payment = '1661489443000')	10	8
   FILTER_FULL_SCAN(operator:RANGE,predicate:last_payment < '1661489443000')	11	7
   ```
   With the `last_payment =` clause, the result has `numEntriesScannedInFilter=42846997`; if we remove it, `numEntriesScannedInFilter=554` and the query finishes in milliseconds instead of hundreds of milliseconds.
   It seems we are doing a full scan on all docs instead of just those returned by the inverted index.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] richardstartin commented on issue #9402: [Bug] Full scan is happening on all docs instead of just those returned by the inverted index

Posted by GitBox <gi...@apache.org>.
richardstartin commented on issue #9402:
URL: https://github.com/apache/pinot/issues/9402#issuecomment-1248355885

   Linking #7597




[GitHub] [pinot] kishoreg commented on issue #9402: [Bug] Full scan is happening on all docs instead of just those returned by the inverted index

Posted by GitBox <gi...@apache.org>.
kishoreg commented on issue #9402:
URL: https://github.com/apache/pinot/issues/9402#issuecomment-1248117288

   You have an OR clause on last_payment, which will result in a full scan. Add a range index on last_payment and this will be very fast.
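
For illustration, the suggestion above might translate into a table config change along these lines. This is a minimal sketch, assuming the table's indexes are declared under `tableIndexConfig` with the `rangeIndexColumns` key; the inverted index entry is inferred from the EXPLAIN output in the issue, and the rest of the table config is omitted. The Python wrapper only builds and prints the JSON fragment.

```python
import json

# Illustrative fragment of a Pinot table config (assumption: indexes are
# declared under "tableIndexConfig"). Everything outside this fragment is omitted.
table_config_fragment = {
    "tableIndexConfig": {
        # Already present, judging by FILTER_INVERTED_INDEX in the EXPLAIN output.
        "invertedIndexColumns": ["_viewing_merchant"],
        # The suggested addition: a range index on the column used in the OR clause.
        "rangeIndexColumns": ["last_payment"],
    }
}

print(json.dumps(table_config_fragment, indent=2))
```

Existing segments would typically need to be reloaded before a newly declared index is used.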




[GitHub] [pinot] richardstartin commented on issue #9402: [Bug] Full scan is happening on all docs instead of just those returned by the inverted index

Posted by GitBox <gi...@apache.org>.
richardstartin commented on issue #9402:
URL: https://github.com/apache/pinot/issues/9402#issuecomment-1248354947

   Note that the data structure backing the range index supports passing a bitmap that represents filtering already done, but this capability isn't used in Pinot. If filter bitmaps were passed on to subsequent filters, this feature of the data structure could be used. Generally, filters are executed independently in Pinot and intersected afterwards.
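
The difference between the two strategies described above can be sketched with a toy model. This is not Pinot code: the documents, the `inverted` map, and the `scan`, `independent`, and `pushed_down` functions are all hypothetical, and Python sets stand in for bitmaps. It only shows how many entries a scan examines when it runs independently versus when it receives the doc ids already selected by an earlier filter.

```python
from collections import defaultdict

# Toy segment: one in every hundred docs belongs to merchant "m1".
docs = [
    {"merchant": "m1" if i % 100 == 0 else "m2", "last_payment": i}
    for i in range(10_000)
]

# Inverted index: value -> set of doc ids (a stand-in for a bitmap).
inverted = defaultdict(set)
for doc_id, doc in enumerate(docs):
    inverted[doc["merchant"]].add(doc_id)


def scan(predicate, candidates=None):
    """Scan last_payment and return (matching doc ids, entries examined).

    With candidates=None every doc is examined; otherwise only the
    candidate doc ids are examined.
    """
    ids = range(len(docs)) if candidates is None else candidates
    matched = {d for d in ids if predicate(docs[d]["last_payment"])}
    return matched, len(ids)


def independent(merchant, predicate):
    # Each filter runs on its own; the results are intersected afterwards.
    left = inverted[merchant]
    right, scanned = scan(predicate)
    return left & right, scanned


def pushed_down(merchant, predicate):
    # The inverted-index result restricts which docs the scan touches.
    left = inverted[merchant]
    return scan(predicate, candidates=left)


def below_5000(value):
    return value < 5_000


res_a, scanned_a = independent("m1", below_5000)
res_b, scanned_b = pushed_down("m1", below_5000)
print(scanned_a, scanned_b, res_a == res_b)  # prints: 10000 100 True
```

Both strategies return the same doc ids; only the number of entries examined differs, which is the kind of gap the `numEntriesScannedInFilter` figures in the issue point to.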

