Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/04/28 09:44:27 UTC

[GitHub] [pinot] richardstartin opened a new issue, #8607: Explain plan can be misleading

richardstartin opened a new issue, #8607:
URL: https://github.com/apache/pinot/issues/8607

   I have a query as follows:
   
   ```sql
   select count(*) from githubEvents where dateTrunc('YEAR', event_time) = '2016-01-01 00:00:00.0'
   ```
   
   It produces a nonzero result:
   
   | count(*) |
   |--|
   |323650488|
   
   However, when I try an explain plan:
   
   
   ```sql
   explain plan for select count(*) from githubEvents where dateTrunc('YEAR', event_time) = '2016-01-01 00:00:00.0'
   ```
   
   The plan is built from a randomly chosen segment, in this case one that should have been pruned, so it doesn't reflect how the query is actually evaluated:
   
   
   Operator | Operator_Id | Parent_Id
   -- | -- | --
   BROKER_REDUCE(limit:10) | 0 | -1
   COMBINE_AGGREGATE | 1 | 0
   FAST_FILTERED_COUNT | 2 | 1
   FILTER_EMPTY | 3 | 2
   
   It would be helpful if the plan chose a segment which has data, or queried all segments and merged operators when the operator varies according to segment.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] amrishlal commented on issue #8607: Explain plan can be misleading

Posted by GitBox <gi...@apache.org>.
amrishlal commented on issue #8607:
URL: https://github.com/apache/pinot/issues/8607#issuecomment-1116735978

   If we want to be 100% accurate, then probably the only option is to move towards full evaluation of EXPLAIN PLAN over all segments on all servers. To start with, maybe we could evaluate EXPLAIN PLAN against one segment of each server and do a broker reduce and dedup over all the servers. Later on, a server combine could be added for evaluating against all segments? For now, although not ideal, the user can execute the EXPLAIN PLAN a few times to get a better idea of the variation.
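
The "one segment per server, then dedup on the broker" idea above can be sketched as follows. This is only an illustration under stated assumptions, not Pinot's implementation: a plan is modelled as a tuple of (operator, operator_id, parent_id) rows like the table in the issue description, the server names are invented, and the two `*_PLACEHOLDER` operator names are made up to stand in for a segment that has matching data.

```python
from collections import Counter

# A plan is a tuple of (operator, operator_id, parent_id) rows, mirroring the
# columns of the EXPLAIN PLAN table earlier in this thread.
EMPTY_PLAN = (
    ("COMBINE_AGGREGATE", 1, 0),
    ("FAST_FILTERED_COUNT", 2, 1),
    ("FILTER_EMPTY", 3, 2),
)
# Hypothetical plan for a segment that has matching data; the two
# *_PLACEHOLDER operator names are made up for illustration.
WORKING_PLAN = (
    ("COMBINE_AGGREGATE", 1, 0),
    ("FAST_FILTERED_COUNT", 2, 1),
    ("FILTER_PLACEHOLDER", 3, 2),
    ("SCAN_PLACEHOLDER", 4, 3),
)


def dedup_plans(plans_by_server):
    """Return (plan, server_count) pairs, most frequently reported plan first."""
    return Counter(plans_by_server.values()).most_common()


# Hypothetical responses: one sampled segment per server.
responses = {
    "server-1": EMPTY_PLAN,
    "server-2": WORKING_PLAN,
    "server-3": EMPTY_PLAN,
}

for plan, count in dedup_plans(responses):
    leaf = plan[-1][0]
    print(f"{count} server(s): leaf operator {leaf}")
```

Keeping the count next to each distinct plan lets the reader see the variation in a single response instead of re-running EXPLAIN PLAN several times.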




[GitHub] [pinot] Jackie-Jiang commented on issue #8607: Explain plan can be misleading

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on issue #8607:
URL: https://github.com/apache/pinot/issues/8607#issuecomment-1112790524

   cc @amrishlal 




[GitHub] [pinot] siddharthteotia commented on issue #8607: Explain plan can be misleading

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on issue #8607:
URL: https://github.com/apache/pinot/issues/8607#issuecomment-1115489590

   Working on it




[GitHub] [pinot] richardstartin commented on issue #8607: Explain plan can be misleading

Posted by GitBox <gi...@apache.org>.
richardstartin commented on issue #8607:
URL: https://github.com/apache/pinot/issues/8607#issuecomment-1115856106

   > One approach could be:
   > 
   > * Send query to all segments and all servers
   > * On the broker, pick the response with deepest tree and avoid merging across segments
   > 
   > I think this is fairly simple
   > 
   > While the above will address the general accuracy issues pointed out above associated with picking a random segment on a random server, there is still going to be some inaccuracy in the case where a server physical operator tree is indeed different for a segment. This can be done as another follow-up
   
   I think it's misleading to choose just one child. It's not a degenerate case for there to be a mix, where some segments can be skipped over based on metadata and others need to perform some real work. This can also happen if there has been a configuration change and the segments are in a mixed state.




[GitHub] [pinot] siddharthteotia commented on issue #8607: Explain plan can be misleading

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on issue #8607:
URL: https://github.com/apache/pinot/issues/8607#issuecomment-1115493629

   One approach could be:
   
   - Send query to all segments and all servers
   - On the broker, pick the response with deepest tree and avoid merging across segments
   
   I think this is fairly simple
   
   While the above will address the general accuracy issues pointed out above associated with picking a random segment on a random server, there is still going to be some inaccuracy in the case where a server physical operator tree is indeed different for a segment. This can be done as another follow-up
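
A minimal sketch of the "pick the response with the deepest tree" step, assuming each response arrives as (operator, operator_id, parent_id) rows like the table in the issue description. The function names and the `*_PLACEHOLDER` operators are hypothetical, and the code is not taken from Pinot.

```python
def tree_depth(plan):
    """Depth of the operator tree encoded as (operator, id, parent_id) rows."""
    parent_of = {op_id: parent_id for _, op_id, parent_id in plan}

    def depth(op_id):
        # Walk up the parent links until the parent is not part of this plan.
        levels = 1
        while parent_of.get(op_id) in parent_of:
            op_id = parent_of[op_id]
            levels += 1
        return levels

    return max((depth(op_id) for op_id in parent_of), default=0)


def pick_deepest(plans):
    """Return the plan with the deepest tree (the first one wins on ties)."""
    return max(plans, key=tree_depth)


# Plan reported for a segment whose filter matches nothing (from the issue).
EMPTY_PLAN = (
    ("COMBINE_AGGREGATE", 1, 0),
    ("FAST_FILTERED_COUNT", 2, 1),
    ("FILTER_EMPTY", 3, 2),
)
# Hypothetical deeper plan; the *_PLACEHOLDER operator names are made up.
WORKING_PLAN = (
    ("COMBINE_AGGREGATE", 1, 0),
    ("FAST_FILTERED_COUNT", 2, 1),
    ("FILTER_PLACEHOLDER", 3, 2),
    ("SCAN_PLACEHOLDER", 4, 3),
)

chosen = pick_deepest([EMPTY_PLAN, WORKING_PLAN])
print(chosen[-1][0])
```

Note that this selection returns exactly one plan, so it says nothing about how many segments took a shallower path.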




[GitHub] [pinot] siddharthteotia commented on issue #8607: Explain plan can be misleading

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on issue #8607:
URL: https://github.com/apache/pinot/issues/8607#issuecomment-1122802400

   Change is in progress




[GitHub] [pinot] siddharthteotia commented on issue #8607: Explain plan can be misleading

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on issue #8607:
URL: https://github.com/apache/pinot/issues/8607#issuecomment-1134900784

   Fixed by https://github.com/apache/pinot/pull/8738




[GitHub] [pinot] siddharthteotia closed issue #8607: Explain plan can be misleading

Posted by GitBox <gi...@apache.org>.
siddharthteotia closed issue #8607: Explain plan can be misleading
URL: https://github.com/apache/pinot/issues/8607




[GitHub] [pinot] richardstartin commented on issue #8607: Explain plan can be misleading

Posted by GitBox <gi...@apache.org>.
richardstartin commented on issue #8607:
URL: https://github.com/apache/pinot/issues/8607#issuecomment-1117032199

   I think the deepest child heuristic might be the right trade-off between accuracy and whatever it is that prevents considering the entire query execution.

