Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/03/21 05:47:17 UTC

[GitHub] [arrow-datafusion] jackwener opened a new issue #2045: `SingleDistinctAggregationToGroupBy` rule cause some strange change

jackwener opened a new issue #2045:
URL: https://github.com/apache/arrow-datafusion/issues/2045


   **Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
   ```sql
   explain verbose select max(distinct(c1)) from test;
   
   +-------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------+
   | plan_type                                             | plan                                                                                                                                        |
   +-------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------+
   | initial_logical_plan                                  | Projection: #MAX(DISTINCT test.c1)                                                                                                          |
   |                                                       |   Aggregate: groupBy=[[]], aggr=[[MAX(DISTINCT #test.c1)]]                                                                                  |
   |                                                       |     TableScan: test projection=None                                                                                                         |
   | logical_plan after simplify_expressions               | SAME TEXT AS ABOVE                                                                                                                          |
   | logical_plan after eliminate_filter                   | SAME TEXT AS ABOVE                                                                                                                          |
   | logical_plan after common_sub_expression_eliminate    | SAME TEXT AS ABOVE                                                                                                                          |
   | logical_plan after eliminate_limit                    | SAME TEXT AS ABOVE                                                                                                                          |
   | logical_plan after projection_push_down               | Projection: #MAX(DISTINCT test.c1)                                                                                                          |
   |                                                       |   Aggregate: groupBy=[[]], aggr=[[MAX(DISTINCT #test.c1)]]                                                                                  |
   |                                                       |     TableScan: test projection=Some([0])                                                                                                    |
   | logical_plan after filter_push_down                   | SAME TEXT AS ABOVE                                                                                                                          |
   | logical_plan after limit_push_down                    | SAME TEXT AS ABOVE                                                                                                                          |
   | logical_plan after SingleDistinctAggregationToGroupBy | Projection: #MAX(DISTINCT test.c1)                                                                                                          |
   |                                                       |   Projection: #MAX(alias1) AS MAX(DISTINCT test.c1)                                                                                         |
   |                                                       |     Aggregate: groupBy=[[]], aggr=[[MAX(#alias1)]]                                                                                          |
   |                                                       |       Aggregate: groupBy=[[#test.c1 AS alias1]], aggr=[[]]                                                                                  |
   |                                                       |         TableScan: test projection=Some([0]) 
   ```
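
For readers unfamiliar with the rule, the final plan above can be read as a two-level aggregation: the inner `Aggregate` groups by `c1` (one row per distinct value, aliased `alias1`), and the outer `Aggregate` applies the plain aggregate over those rows. Below is a minimal Python sketch of that equivalence on in-memory data. It does not use DataFusion; the function names and sample values are made up for illustration.

```python
def max_distinct_direct(values):
    """Evaluate MAX(DISTINCT c1) directly: dedupe, then take the max."""
    return max(set(values))


def max_distinct_two_level(values):
    """Mimic the rewritten plan with two aggregation levels."""
    # Inner Aggregate: groupBy=[c1 AS alias1], aggr=[] -> one row per distinct c1.
    groups = {}
    for v in values:
        groups.setdefault(v, None)
    alias1 = list(groups)
    # Outer Aggregate: groupBy=[], aggr=[MAX(alias1)].
    return max(alias1)


c1 = [3, 1, 3, 7, 7, 2]  # hypothetical contents of test.c1
print(max_distinct_direct(c1), max_distinct_two_level(c1))
```

Both functions return the same value, which is why the rewrite is valid in general, even though for `MAX` the extra grouping level does no useful work.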
   
   **Describe the solution you'd like**
   Find the problem in the `SingleDistinctAggregationToGroupBy` rule and fix it.
   
   **Describe alternatives you've considered**
   The `SingleDistinctAggregationToGroupBy` rule makes the plan more complex.
   
   **Additional context**
   None


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] jackwener commented on issue #2045: `SingleDistinctAggregationToGroupBy` rule cause some strange change

Posted by GitBox <gi...@apache.org>.
jackwener commented on issue #2045:
URL: https://github.com/apache/arrow-datafusion/issues/2045#issuecomment-1076018593


   The goal now is to rewrite the plan from
   
   ```sql
   | initial_logical_plan | Projection: #COUNT(DISTINCT test.c1)                     
   |                      |   Aggregate: groupBy=[[]], aggr=[[COUNT(DISTINCT #test.c1)]]                      
   |                      |     TableScan: test projection=None
   ```
   to
   ```sql
   | initial_logical_plan | Projection: #COUNT(DISTINCT test.c1)
   |                      |   Projection: #COUNT(#test.c1) AS COUNT(DISTINCT test.c1)                   
   |                      |     Aggregate: groupBy=[[]], aggr=[[COUNT(#test.c1)]]                
   |                      |       TableScan: test projection=None
   ```
   instead of 
   
   ```sql
   | logical_plan  | Projection: #COUNT(DISTINCT test.c1)          
   |               |   Projection: #COUNT(alias1) AS COUNT(DISTINCT test.c1)     
   |               |     Aggregate: groupBy=[[]], aggr=[[COUNT(#alias1)]]        
   |               |       Aggregate: groupBy=[[#test.c1 AS alias1]], aggr=[[]]
   |               |         TableScan: test projection=Some([0])
   ```





[GitHub] [arrow-datafusion] jiangzhx edited a comment on issue #2045: `SingleDistinctAggregationToGroupBy` rule cause some strange change

Posted by GitBox <gi...@apache.org>.
jiangzhx edited a comment on issue #2045:
URL: https://github.com/apache/arrow-datafusion/issues/2045#issuecomment-1073654184


   I talked with @jackwener and @ic4y.
   It looks like the `max` and `min` functions cannot use this optimizer rule.
   We should add more test cases.
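
One way to see why `MAX` and `MIN` are a special case: their results do not depend on duplicate inputs, so `DISTINCT` is redundant for them, while `COUNT` and `SUM` do depend on duplicates. The sketch below checks this on a small list in plain Python. The sample values are hypothetical and no DataFusion API is involved.

```python
c1 = [3, 1, 3, 7, 7, 2]  # hypothetical column values with duplicates
distinct_c1 = set(c1)

# MAX and MIN ignore duplicates, so DISTINCT does not change the result.
max_same = max(c1) == max(distinct_c1)
min_same = min(c1) == min(distinct_c1)

# COUNT and SUM do depend on duplicates, so DISTINCT must be preserved.
count_same = len(c1) == len(distinct_c1)
sum_same = sum(c1) == sum(distinct_c1)

print(max_same, min_same, count_same, sum_same)  # True True False False
```

Test cases for the rule could cover both groups: aggregates where dropping `DISTINCT` is safe and aggregates where it is not.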
   





[GitHub] [arrow-datafusion] jackwener commented on issue #2045: `SingleDistinctAggregationToGroupBy` rule cause some strange change

Posted by GitBox <gi...@apache.org>.
jackwener commented on issue #2045:
URL: https://github.com/apache/arrow-datafusion/issues/2045#issuecomment-1073505736


   I am investigating this issue. If you know anything about it, please share.





[GitHub] [arrow-datafusion] jiangzhx commented on issue #2045: `SingleDistinctAggregationToGroupBy` rule cause some strange change

Posted by GitBox <gi...@apache.org>.
jiangzhx commented on issue #2045:
URL: https://github.com/apache/arrow-datafusion/issues/2045#issuecomment-1073654184


   I talked with @jackwener and @ic4y.





[GitHub] [arrow-datafusion] jiangzhx commented on issue #2045: `SingleDistinctAggregationToGroupBy` rule cause some strange change

Posted by GitBox <gi...@apache.org>.
jiangzhx commented on issue #2045:
URL: https://github.com/apache/arrow-datafusion/issues/2045#issuecomment-1073581065


   @jackwener, take a look at this PR: https://github.com/apache/arrow-datafusion/pull/1315
   
   @ic4y contributed it.





[GitHub] [arrow-datafusion] jackwener commented on issue #2045: `SingleDistinctAggregationToGroupBy` rule cause some strange change

Posted by GitBox <gi...@apache.org>.
jackwener commented on issue #2045:
URL: https://github.com/apache/arrow-datafusion/issues/2045#issuecomment-1076016569


   While rewriting the plan to eliminate the `DISTINCT`, I ran into some problems I can't figure out. I need to learn more before trying again.





[GitHub] [arrow-datafusion] jackwener edited a comment on issue #2045: `SingleDistinctAggregationToGroupBy` rule cause some strange change

Posted by GitBox <gi...@apache.org>.
jackwener edited a comment on issue #2045:
URL: https://github.com/apache/arrow-datafusion/issues/2045#issuecomment-1076018593


   The goal now is to rewrite the plan from
   
   ```sql
   | initial_logical_plan | Projection: #Max(DISTINCT test.c1)                     
   |                      |   Aggregate: groupBy=[[]], aggr=[[Max(DISTINCT #test.c1)]]                      
   |                      |     TableScan: test projection=None
   ```
   to
   ```sql
   | initial_logical_plan | Projection: #Max(DISTINCT test.c1)
   |                      |   Projection: #Max(#test.c1) AS Max(DISTINCT test.c1)                   
   |                      |     Aggregate: groupBy=[[]], aggr=[[Max(#test.c1)]]                
   |                      |       TableScan: test projection=None
   ```
   instead of 
   
   ```sql
   | logical_plan  | Projection: #Max(DISTINCT test.c1)          
   |               |   Projection: #Max(alias1) AS Max(DISTINCT test.c1)     
   |               |     Aggregate: groupBy=[[]], aggr=[[Max(#alias1)]]        
   |               |       Aggregate: groupBy=[[#test.c1 AS alias1]], aggr=[[]]
   |               |         TableScan: test projection=Some([0])
   ```
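
The target rewrite described above amounts to dropping `DISTINCT` when the aggregate is insensitive to duplicates. The following is a toy sketch of that decision in Python, not DataFusion code: `AggExpr`, `drop_redundant_distinct`, and the `DUPLICATE_INSENSITIVE` set are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass

# Aggregates whose result is unaffected by duplicate inputs (assumed set).
DUPLICATE_INSENSITIVE = {"MAX", "MIN"}


@dataclass(frozen=True)
class AggExpr:
    """Toy stand-in for an aggregate expression; not a DataFusion type."""
    func: str
    column: str
    distinct: bool = False

    def __str__(self):
        prefix = "DISTINCT " if self.distinct else ""
        return f"{self.func}({prefix}{self.column})"


def drop_redundant_distinct(expr):
    """Return the expression without DISTINCT when DISTINCT cannot matter."""
    if expr.distinct and expr.func.upper() in DUPLICATE_INSENSITIVE:
        return AggExpr(expr.func, expr.column, distinct=False)
    return expr


print(drop_redundant_distinct(AggExpr("MAX", "test.c1", distinct=True)))
print(drop_redundant_distinct(AggExpr("COUNT", "test.c1", distinct=True)))
```

In this sketch `MAX(DISTINCT test.c1)` becomes `MAX(test.c1)`, while `COUNT(DISTINCT test.c1)` is left unchanged, since removing `DISTINCT` from a count would change its result.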

