Posted to gitbox@hive.apache.org by GitBox <gi...@apache.org> on 2020/07/08 23:00:01 UTC

[GitHub] [hive] vineetgarg02 opened a new pull request #1231: HIVE-23822 Sorted dynamic partition optimization could remove auto stat task

vineetgarg02 opened a new pull request #1231:
URL: https://github.com/apache/hive/pull/1231


   …at task
   
   ## NOTICE
   
   Please create an issue in ASF JIRA before opening a pull request,
   and you need to set the title of the pull request which starts with
   the corresponding JIRA issue number. (e.g. HIVE-XXXXX: Fix a typo in YYY)
   For more details, please see https://cwiki.apache.org/confluence/display/Hive/HowToContribute
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] vineetgarg02 commented on a change in pull request #1231: HIVE-23822 Sorted dynamic partition optimization could remove auto stat task

Posted by GitBox <gi...@apache.org>.
vineetgarg02 commented on a change in pull request #1231:
URL: https://github.com/apache/hive/pull/1231#discussion_r454511428



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOptimizer.java
##########
@@ -409,26 +409,54 @@ private boolean removeRSInsertedByEnforceBucketing(FileSinkOperator fsOp) {
       // and grand child
       if (found) {
         Operator<? extends OperatorDesc> rsParent = rsToRemove.getParentOperators().get(0);
-        Operator<? extends OperatorDesc> rsChild = rsToRemove.getChildOperators().get(0);
-        Operator<? extends OperatorDesc> rsGrandChild = rsChild.getChildOperators().get(0);
-
-        if (rsChild instanceof SelectOperator) {
-          // if schema size cannot be matched, then it could be because of constant folding
-          // converting partition column expression to constant expression. The constant
-          // expression will then get pruned by column pruner since it will not reference to
-          // any columns.
-          if (rsParent.getSchema().getSignature().size() !=
-              rsChild.getSchema().getSignature().size()) {
+        List<Operator<? extends OperatorDesc>> rsChildren = rsToRemove.getChildOperators();
+
+        Operator<? extends OperatorDesc> rsChildToRemove = null;
+
+        for (Operator<? extends OperatorDesc> rsChild : rsChildren) {

Review comment:
       @jcamachor I have addressed this in the latest commit.






[GitHub] [hive] vineetgarg02 commented on a change in pull request #1231: HIVE-23822 Sorted dynamic partition optimization could remove auto stat task

Posted by GitBox <gi...@apache.org>.
vineetgarg02 commented on a change in pull request #1231:
URL: https://github.com/apache/hive/pull/1231#discussion_r454511428



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOptimizer.java
##########
@@ -409,26 +409,54 @@ private boolean removeRSInsertedByEnforceBucketing(FileSinkOperator fsOp) {
       // and grand child
       if (found) {
         Operator<? extends OperatorDesc> rsParent = rsToRemove.getParentOperators().get(0);
-        Operator<? extends OperatorDesc> rsChild = rsToRemove.getChildOperators().get(0);
-        Operator<? extends OperatorDesc> rsGrandChild = rsChild.getChildOperators().get(0);
-
-        if (rsChild instanceof SelectOperator) {
-          // if schema size cannot be matched, then it could be because of constant folding
-          // converting partition column expression to constant expression. The constant
-          // expression will then get pruned by column pruner since it will not reference to
-          // any columns.
-          if (rsParent.getSchema().getSignature().size() !=
-              rsChild.getSchema().getSignature().size()) {
+        List<Operator<? extends OperatorDesc>> rsChildren = rsToRemove.getChildOperators();
+
+        Operator<? extends OperatorDesc> rsChildToRemove = null;
+
+        for (Operator<? extends OperatorDesc> rsChild : rsChildren) {

Review comment:
       @jcamachor I have addressed this in the latest commit.






[GitHub] [hive] vineetgarg02 commented on a change in pull request #1231: HIVE-23822 Sorted dynamic partition optimization could remove auto stat task

Posted by GitBox <gi...@apache.org>.
vineetgarg02 commented on a change in pull request #1231:
URL: https://github.com/apache/hive/pull/1231#discussion_r454462383



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOptimizer.java
##########
@@ -409,26 +409,54 @@ private boolean removeRSInsertedByEnforceBucketing(FileSinkOperator fsOp) {
       // and grand child
       if (found) {
         Operator<? extends OperatorDesc> rsParent = rsToRemove.getParentOperators().get(0);
-        Operator<? extends OperatorDesc> rsChild = rsToRemove.getChildOperators().get(0);
-        Operator<? extends OperatorDesc> rsGrandChild = rsChild.getChildOperators().get(0);
-
-        if (rsChild instanceof SelectOperator) {
-          // if schema size cannot be matched, then it could be because of constant folding
-          // converting partition column expression to constant expression. The constant
-          // expression will then get pruned by column pruner since it will not reference to
-          // any columns.
-          if (rsParent.getSchema().getSignature().size() !=
-              rsChild.getSchema().getSignature().size()) {
+        List<Operator<? extends OperatorDesc>> rsChildren = rsToRemove.getChildOperators();
+
+        Operator<? extends OperatorDesc> rsChildToRemove = null;
+
+        for (Operator<? extends OperatorDesc> rsChild : rsChildren) {

Review comment:
       I assumed that this could be a possibility and therefore accounted for it, but if this assumption is wrong I will update the code.






[GitHub] [hive] jcamachor commented on a change in pull request #1231: HIVE-23822 Sorted dynamic partition optimization could remove auto stat task

Posted by GitBox <gi...@apache.org>.
jcamachor commented on a change in pull request #1231:
URL: https://github.com/apache/hive/pull/1231#discussion_r454092186



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOptimizer.java
##########
@@ -409,26 +409,54 @@ private boolean removeRSInsertedByEnforceBucketing(FileSinkOperator fsOp) {
       // and grand child
       if (found) {
         Operator<? extends OperatorDesc> rsParent = rsToRemove.getParentOperators().get(0);
-        Operator<? extends OperatorDesc> rsChild = rsToRemove.getChildOperators().get(0);
-        Operator<? extends OperatorDesc> rsGrandChild = rsChild.getChildOperators().get(0);
-
-        if (rsChild instanceof SelectOperator) {
-          // if schema size cannot be matched, then it could be because of constant folding
-          // converting partition column expression to constant expression. The constant
-          // expression will then get pruned by column pruner since it will not reference to
-          // any columns.
-          if (rsParent.getSchema().getSignature().size() !=
-              rsChild.getSchema().getSignature().size()) {
+        List<Operator<? extends OperatorDesc>> rsChildren = rsToRemove.getChildOperators();
+
+        Operator<? extends OperatorDesc> rsChildToRemove = null;
+
+        for (Operator<? extends OperatorDesc> rsChild : rsChildren) {

Review comment:
       In which case would we have an RS with multiple children? Can we leave a comment explaining it? Otherwise, we should add a Precondition checking that the number of children is 1.






[GitHub] [hive] vineetgarg02 commented on a change in pull request #1231: HIVE-23822 Sorted dynamic partition optimization could remove auto stat task

Posted by GitBox <gi...@apache.org>.
vineetgarg02 commented on a change in pull request #1231:
URL: https://github.com/apache/hive/pull/1231#discussion_r454645150



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOptimizer.java
##########
@@ -409,26 +409,54 @@ private boolean removeRSInsertedByEnforceBucketing(FileSinkOperator fsOp) {
       // and grand child
       if (found) {
         Operator<? extends OperatorDesc> rsParent = rsToRemove.getParentOperators().get(0);
-        Operator<? extends OperatorDesc> rsChild = rsToRemove.getChildOperators().get(0);
-        Operator<? extends OperatorDesc> rsGrandChild = rsChild.getChildOperators().get(0);
-
-        if (rsChild instanceof SelectOperator) {
-          // if schema size cannot be matched, then it could be because of constant folding
-          // converting partition column expression to constant expression. The constant
-          // expression will then get pruned by column pruner since it will not reference to
-          // any columns.
-          if (rsParent.getSchema().getSignature().size() !=
-              rsChild.getSchema().getSignature().size()) {
+        List<Operator<? extends OperatorDesc>> rsChildren = rsToRemove.getChildOperators();
+
+        Operator<? extends OperatorDesc> rsChildToRemove = null;
+
+        for (Operator<? extends OperatorDesc> rsChild : rsChildren) {

Review comment:
       @jcamachor all tests passed.






[GitHub] [hive] vineetgarg02 merged pull request #1231: HIVE-23822 Sorted dynamic partition optimization could remove auto stat task

Posted by GitBox <gi...@apache.org>.
vineetgarg02 merged pull request #1231:
URL: https://github.com/apache/hive/pull/1231


   




[GitHub] [hive] jcamachor commented on a change in pull request #1231: HIVE-23822 Sorted dynamic partition optimization could remove auto stat task

Posted by GitBox <gi...@apache.org>.
jcamachor commented on a change in pull request #1231:
URL: https://github.com/apache/hive/pull/1231#discussion_r454464296



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOptimizer.java
##########
@@ -409,26 +409,54 @@ private boolean removeRSInsertedByEnforceBucketing(FileSinkOperator fsOp) {
       // and grand child
       if (found) {
         Operator<? extends OperatorDesc> rsParent = rsToRemove.getParentOperators().get(0);
-        Operator<? extends OperatorDesc> rsChild = rsToRemove.getChildOperators().get(0);
-        Operator<? extends OperatorDesc> rsGrandChild = rsChild.getChildOperators().get(0);
-
-        if (rsChild instanceof SelectOperator) {
-          // if schema size cannot be matched, then it could be because of constant folding
-          // converting partition column expression to constant expression. The constant
-          // expression will then get pruned by column pruner since it will not reference to
-          // any columns.
-          if (rsParent.getSchema().getSignature().size() !=
-              rsChild.getSchema().getSignature().size()) {
+        List<Operator<? extends OperatorDesc>> rsChildren = rsToRemove.getChildOperators();
+
+        Operator<? extends OperatorDesc> rsChildToRemove = null;
+
+        for (Operator<? extends OperatorDesc> rsChild : rsChildren) {

Review comment:
       Yes, there should not be an RS with multiple children, so we can simplify that code. You can even add an assert to the new code to make sure.
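
For illustration, the single-child check suggested here could look roughly like the sketch below. It is not the code that was merged: `Op` is a hypothetical stand-in for Hive's `Operator`, and `onlyChild` is an invented helper name. In Hive itself the check would more likely be written with Guava's `Preconditions.checkState`; a plain `IllegalStateException` is used here only to keep the example free of dependencies.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Hive's Operator; only the child list matters here.
final class Op {
  final String name;
  final List<Op> children = new ArrayList<>();

  Op(String name) {
    this.name = name;
  }

  List<Op> getChildOperators() {
    return children;
  }
}

final class SingleChildCheck {
  // Returns the only child of rs, failing fast if the
  // "an RS has exactly one child" assumption is violated.
  static Op onlyChild(Op rs) {
    List<Op> kids = rs.getChildOperators();
    if (kids.size() != 1) {
      throw new IllegalStateException(
          "Expected exactly 1 child for " + rs.name + " but found " + kids.size());
    }
    return kids.get(0);
  }

  public static void main(String[] args) {
    Op rs = new Op("RS");
    rs.getChildOperators().add(new Op("SEL"));
    // With a single child the helper simply hands it back.
    System.out.println(onlyChild(rs).name);
  }
}
```

With the assumption enforced up front, the loop over `rsChildren` is no longer needed and the method can work with a single `rsChild` again.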






[GitHub] [hive] jcamachor commented on a change in pull request #1231: HIVE-23822 Sorted dynamic partition optimization could remove auto stat task

Posted by GitBox <gi...@apache.org>.
jcamachor commented on a change in pull request #1231:
URL: https://github.com/apache/hive/pull/1231#discussion_r454092186



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOptimizer.java
##########
@@ -409,26 +409,54 @@ private boolean removeRSInsertedByEnforceBucketing(FileSinkOperator fsOp) {
       // and grand child
       if (found) {
         Operator<? extends OperatorDesc> rsParent = rsToRemove.getParentOperators().get(0);
-        Operator<? extends OperatorDesc> rsChild = rsToRemove.getChildOperators().get(0);
-        Operator<? extends OperatorDesc> rsGrandChild = rsChild.getChildOperators().get(0);
-
-        if (rsChild instanceof SelectOperator) {
-          // if schema size cannot be matched, then it could be because of constant folding
-          // converting partition column expression to constant expression. The constant
-          // expression will then get pruned by column pruner since it will not reference to
-          // any columns.
-          if (rsParent.getSchema().getSignature().size() !=
-              rsChild.getSchema().getSignature().size()) {
+        List<Operator<? extends OperatorDesc>> rsChildren = rsToRemove.getChildOperators();
+
+        Operator<? extends OperatorDesc> rsChildToRemove = null;
+
+        for (Operator<? extends OperatorDesc> rsChild : rsChildren) {

Review comment:
       In which case would we have an RS with multiple children? I thought this would never happen. Can we leave a comment explaining it? Otherwise, we should add a Precondition checking that the number of children is 1.



