You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2020/07/09 04:34:17 UTC

[GitHub] [incubator-pinot] fx19880617 opened a new pull request #5671: Rewrite non-aggregate group by query to distinct query

fx19880617 opened a new pull request #5671:
URL: https://github.com/apache/incubator-pinot/pull/5671


   ## Description
   Per https://github.com/apache/incubator-pinot/issues/5663, this PR rewrite non-aggregation groupBy query to distinct query.
   e.g. 
   `SELECT col1+col2*5 FROM foo GROUP BY col1, col2` => `SELECT distinct col1+col2*5 FROM foo`
   `SELECT col1, col2 FROM foo GROUP BY col1, col2` => `SELECT distinct col1, col2 FROM foo`
   
   
   ## Release Notes
   
   Rewrite non-aggregation group by query to distinct query.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5671: Rewrite non-aggregate group by query to distinct query

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #5671:
URL: https://github.com/apache/incubator-pinot/pull/5671#discussion_r452572766



##########
File path: pinot-core/src/test/java/org/apache/pinot/queries/DistinctQueriesTest.java
##########
@@ -347,19 +389,16 @@ private DistinctTable getDistinctTableInnerSegment(String query) {
    *   </li>
    * </ul>
    */
-  @Test
-  public void testDistinctInterSegment()
+  public void testDistinctInterSegmentHelper(String[] pqlQueries, String[] sqlQueries)

Review comment:
       private?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] siddharthteotia edited a comment on pull request #5671: Rewrite non-aggregate group by query to distinct query

Posted by GitBox <gi...@apache.org>.
siddharthteotia edited a comment on pull request #5671:
URL: https://github.com/apache/incubator-pinot/pull/5671#issuecomment-656132865


   For completeness, it would be nice to add tests to DistinctQueriesTest file as well.  It has both single column, multi column along with transform function tests for DISTINCT for both intra and inter segment. 
   
   select distinct a from foo
   select distinct a, b from foo
   select distinct add(a, b) from foo
   
   The above are already there. We should add for query rewrite for both identifiers and expressions. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] fx19880617 commented on pull request #5671: Rewrite non-aggregate group by query to distinct query

Posted by GitBox <gi...@apache.org>.
fx19880617 commented on pull request #5671:
URL: https://github.com/apache/incubator-pinot/pull/5671#issuecomment-656365240


   > DistinctQueriesTest
   
   
   
   > For completeness, it would be nice to add tests to DistinctQueriesTest file as well. It has both single column, multi column along with transform function tests for DISTINCT for both intra and inter segment.
   > 
   > select distinct a from foo
   > select distinct a, b from foo
   > select distinct add(a, b) from foo
   > 
   > The above are already there. We should add for query rewrite for both identifiers and expressions.
   
   Added.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] mayankshriv commented on a change in pull request #5671: Rewrite non-aggregate group by query to distinct query

Posted by GitBox <gi...@apache.org>.
mayankshriv commented on a change in pull request #5671:
URL: https://github.com/apache/incubator-pinot/pull/5671#discussion_r452327225



##########
File path: pinot-common/src/main/java/org/apache/pinot/sql/parsers/CalciteSqlParser.java
##########
@@ -331,12 +345,59 @@ private static void queryRewrite(PinotQuery pinotQuery) {
       pinotQuery.setFilterExpression(updatedFilterExpression);
     }
 
+    // Rewrite GroupBy to Distinct
+    rewriteNonAggregationGroupByToDistinct(pinotQuery);
+
     // Update alias
     Map<Identifier, Expression> aliasMap = extractAlias(pinotQuery.getSelectList());
     applyAlias(aliasMap, pinotQuery);
     validate(aliasMap, pinotQuery);
   }
 
+  private static void rewriteNonAggregationGroupByToDistinct(PinotQuery pinotQuery) {

Review comment:
       Would be good to add doc with sample query translations (as already stated in the PR description).

##########
File path: pinot-common/src/main/java/org/apache/pinot/sql/parsers/CalciteSqlParser.java
##########
@@ -196,13 +198,25 @@ private static boolean isAggregateExpression(Expression expression) {
     return false;
   }
 
-  public static Set<String> extractIdentifiers(List<Expression> expressions) {
+  /**
+   * Extract all the identifiers from given expressions.
+   *
+   * @param expressions
+   * @param excludeAs if true, ignores the right side identifier for AS function.
+   * @return all the identifier names.
+   */
+  public static Set<String> extractIdentifiers(List<Expression> expressions, boolean excludeAs) {

Review comment:
       May be a non-issue here, but just wanted to point out that there are aggregation functions that take multiple args, some of which are expected to be literals (even though they look like expressions, eg distinctCountThetaSketch).




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5671: Rewrite non-aggregate group by query to distinct query

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #5671:
URL: https://github.com/apache/incubator-pinot/pull/5671#discussion_r452572531



##########
File path: pinot-core/src/test/java/org/apache/pinot/queries/DistinctQueriesTest.java
##########
@@ -179,18 +179,15 @@ private ImmutableSegment createSegment(int index, List<GenericRow> records)
    *   <li>Selecting some columns with filter that does not match any record</li>
    * </ul>
    */
-  @Test
-  public void testDistinctInnerSegment()
+  public void testDistinctInnerSegmentHelper(String[] queries, boolean isPql)

Review comment:
       private?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] fx19880617 merged pull request #5671: Rewrite non-aggregate group by query to distinct query

Posted by GitBox <gi...@apache.org>.
fx19880617 merged pull request #5671:
URL: https://github.com/apache/incubator-pinot/pull/5671


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] siddharthteotia commented on pull request #5671: Rewrite non-aggregate group by query to distinct query

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on pull request #5671:
URL: https://github.com/apache/incubator-pinot/pull/5671#issuecomment-656138929


   Not related to this PR, but when I had added DISTINCT, I don't think alias support was there. So alias query tests are not there for DISTINCT. You might want to add them or I can do in a follow-up. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] fx19880617 commented on pull request #5671: Rewrite non-aggregate group by query to distinct query

Posted by GitBox <gi...@apache.org>.
fx19880617 commented on pull request #5671:
URL: https://github.com/apache/incubator-pinot/pull/5671#issuecomment-656267927


   > Not related to this PR, but when I had added DISTINCT, I don't think alias support was there. So alias query tests are not there for DISTINCT. You might want to add them or I can do in a follow-up.
   
   True alias support require more on the distinct side, which is not addressed in this PR.
   From sql side, this is typically expressed in sub-queries.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] fx19880617 edited a comment on pull request #5671: Rewrite non-aggregate group by query to distinct query

Posted by GitBox <gi...@apache.org>.
fx19880617 edited a comment on pull request #5671:
URL: https://github.com/apache/incubator-pinot/pull/5671#issuecomment-656267927


   > Not related to this PR, but when I had added DISTINCT, I don't think alias support was there. So alias query tests are not there for DISTINCT. You might want to add them or I can do in a follow-up.
   
   True alias support require more on the distinct side, which is not addressed in this PR.
   From sql side, this is typically expressed in sub-queries.
   
   Alias supports may need to be pushed down to distinct function.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] fx19880617 commented on a change in pull request #5671: Rewrite non-aggregate group by query to distinct query

Posted by GitBox <gi...@apache.org>.
fx19880617 commented on a change in pull request #5671:
URL: https://github.com/apache/incubator-pinot/pull/5671#discussion_r452391596



##########
File path: pinot-common/src/main/java/org/apache/pinot/sql/parsers/CalciteSqlParser.java
##########
@@ -196,13 +198,25 @@ private static boolean isAggregateExpression(Expression expression) {
     return false;
   }
 
-  public static Set<String> extractIdentifiers(List<Expression> expressions) {
+  /**
+   * Extract all the identifiers from given expressions.
+   *
+   * @param expressions
+   * @param excludeAs if true, ignores the right side identifier for AS function.
+   * @return all the identifier names.
+   */
+  public static Set<String> extractIdentifiers(List<Expression> expressions, boolean excludeAs) {

Review comment:
       I see, if those args are single-quoted then it should be fine .

##########
File path: pinot-common/src/main/java/org/apache/pinot/sql/parsers/CalciteSqlParser.java
##########
@@ -331,12 +345,59 @@ private static void queryRewrite(PinotQuery pinotQuery) {
       pinotQuery.setFilterExpression(updatedFilterExpression);
     }
 
+    // Rewrite GroupBy to Distinct
+    rewriteNonAggregationGroupByToDistinct(pinotQuery);
+
     // Update alias
     Map<Identifier, Expression> aliasMap = extractAlias(pinotQuery.getSelectList());
     applyAlias(aliasMap, pinotQuery);
     validate(aliasMap, pinotQuery);
   }
 
+  private static void rewriteNonAggregationGroupByToDistinct(PinotQuery pinotQuery) {

Review comment:
       will do 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] siddharthteotia commented on pull request #5671: Rewrite non-aggregate group by query to distinct query

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on pull request #5671:
URL: https://github.com/apache/incubator-pinot/pull/5671#issuecomment-656132865


   For completeness, it would be nice to add tests to DistinctQueriesTest file as well.  It has both single column, multi column along with transform function tests for DISTINCT.
   
   select distinct a from foo
   select distinct a, b from foo
   select distinct add(a, b) from foo
   
   The above are already there. We should add for query rewrite for both identifiers and expressions. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org