Posted to gitbox@hive.apache.org by GitBox <gi...@apache.org> on 2021/12/02 18:48:27 UTC

[GitHub] [hive] asolimando opened a new pull request #2839: HIVE-25766: java.util.NoSuchElementException in HiveFilterProjectTran…

asolimando opened a new pull request #2839:
URL: https://github.com/apache/hive/pull/2839


   …sposeRule if predicate has no InputRef
   
   
   ### What changes were proposed in this pull request?
   Fix the `java.util.NoSuchElementException` thrown in `HiveFilterProjectTransposeRule` when the predicate to be transposed has no input ref.
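
For readers unfamiliar with the failure mode, here is a minimal sketch of it using only the JDK. It does not reproduce the Hive code: the thread does not show which call actually threw, so the sketch assumes an element was taken from an empty set of input refs. The class and method names (`EmptyInputRefsSketch`, `firstRefUnguarded`, `firstRefGuarded`) are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.NoSuchElementException;
import java.util.Set;
import java.util.TreeSet;

public class EmptyInputRefsSketch {

  // Hypothetical stand-in for logic that needs one referenced column index.
  // iterator().next() on an empty set throws NoSuchElementException.
  public static int firstRefUnguarded(Set<Integer> inputRefs) {
    return inputRefs.iterator().next();
  }

  // Guarded variant: returns -1 ("nothing to check") instead of throwing,
  // in the spirit of the early return on an empty set of input refs.
  public static int firstRefGuarded(Set<Integer> inputRefs) {
    if (inputRefs.isEmpty()) {
      return -1;
    }
    return inputRefs.iterator().next();
  }

  public static void main(String[] args) {
    Set<Integer> noRefs = Collections.emptySet();
    try {
      firstRefUnguarded(noRefs);
    } catch (NoSuchElementException e) {
      System.out.println("unguarded call threw NoSuchElementException");
    }
    System.out.println("guarded, no refs: " + firstRefGuarded(noRefs));
    Set<Integer> someRefs = new TreeSet<>(Arrays.asList(3, 1));
    System.out.println("guarded, refs {1, 3}: " + firstRefGuarded(someRefs));
  }
}
```

In the actual rule the guard simply returns from `check`, so no sentinel value is involved; the `-1` here only makes the sketch observable.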
   
   ### Why are the changes needed?
   Without this fix, queries whose predicate to be transposed has no input ref fail with `java.util.NoSuchElementException`.
   
   ### Does this PR introduce _any_ user-facing change?
   No, unless the user's query was affected by the problem.
   
   ### How was this patch tested?
   `mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile="cbo_filter_proj_transpose_noinputref.q" -pl itests/qtest -Pitests`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] kgyrtkirk commented on a change in pull request #2839: HIVE-25766: java.util.NoSuchElementException in HiveFilterProjectTran…

Posted by GitBox <gi...@apache.org>.
kgyrtkirk commented on a change in pull request #2839:
URL: https://github.com/apache/hive/pull/2839#discussion_r761820492



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveFilterProjectTransposeRule.java
##########
@@ -339,6 +339,11 @@ private void check(Filter filter) {
       final RexNode filterCondition = simplify.simplify(filter.getCondition());
 
       final Set<Integer> inputRefs = HiveCalciteUtil.getInputRefs(newCondition);
+      // if the new IS NOT NULL has no input ref, there is redundancy here, bail out
+      if (inputRefs.isEmpty()) {

Review comment:
       there are a couple of ways to fix things like this:
   * we could call `RexSimplify` everywhere - but I don't really like that idea, as it exposes unnecessary complexity at places where a higher-level objective is targeted
   * use the `RelBuilder`, which usually calls the simplifier here and there
   * use a rule which runs simplification - it could be that these rules should be accompanied by `ReduceExpressionsRule`?
   






[GitHub] [hive] asolimando commented on a change in pull request #2839: HIVE-25766: java.util.NoSuchElementException in HiveFilterProjectTran…

Posted by GitBox <gi...@apache.org>.
asolimando commented on a change in pull request #2839:
URL: https://github.com/apache/hive/pull/2839#discussion_r761801075



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveFilterProjectTransposeRule.java
##########
@@ -339,6 +339,11 @@ private void check(Filter filter) {
       final RexNode filterCondition = simplify.simplify(filter.getCondition());
 
       final Set<Integer> inputRefs = HiveCalciteUtil.getInputRefs(newCondition);
+      // if the new IS NOT NULL has no input ref, there is redundancy here, bail out
+      if (inputRefs.isEmpty()) {

Review comment:
       This method is invoked only for `IS NOT NULL` calls, so we can't have other kinds of calls here.
   
   That said, I agree about trying to use `RexSimplify` with `uAF` mode more when we create/modify predicates in our CBO rules. https://github.com/apache/hive/pull/2840 is heading in that direction for `HiveJoinPushTransitivePredicatesRule`, for instance.
   
   It's probably a good idea to open a separate Jira ticket for revising our normalization/simplification strategy in CBO rules.






[GitHub] [hive] asolimando commented on a change in pull request #2839: HIVE-25766: java.util.NoSuchElementException in HiveFilterProjectTran…

Posted by GitBox <gi...@apache.org>.
asolimando commented on a change in pull request #2839:
URL: https://github.com/apache/hive/pull/2839#discussion_r761765160



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveFilterProjectTransposeRule.java
##########
@@ -339,6 +339,11 @@ private void check(Filter filter) {
       final RexNode filterCondition = simplify.simplify(filter.getCondition());
 
       final Set<Integer> inputRefs = HiveCalciteUtil.getInputRefs(newCondition);
+      // if the new IS NOT NULL has no input ref, there is redundancy here, bail out
+      if (inputRefs.isEmpty()) {

Review comment:
       I am tackling the use of `RexSimplify` in this rule in another PR: https://github.com/apache/hive/pull/2840
   Most of the test diffs are (expected) plan changes, but there are also a few real failures I still need to investigate. In the meantime, you can check whether what is done there matches what you have in mind.






[GitHub] [hive] kgyrtkirk merged pull request #2839: HIVE-25766: java.util.NoSuchElementException in HiveFilterProjectTran…

Posted by GitBox <gi...@apache.org>.
kgyrtkirk merged pull request #2839:
URL: https://github.com/apache/hive/pull/2839


   




[GitHub] [hive] kgyrtkirk commented on a change in pull request #2839: HIVE-25766: java.util.NoSuchElementException in HiveFilterProjectTran…

Posted by GitBox <gi...@apache.org>.
kgyrtkirk commented on a change in pull request #2839:
URL: https://github.com/apache/hive/pull/2839#discussion_r761827358



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveFilterProjectTransposeRule.java
##########
@@ -339,6 +339,11 @@ private void check(Filter filter) {
       final RexNode filterCondition = simplify.simplify(filter.getCondition());
 
       final Set<Integer> inputRefs = HiveCalciteUtil.getInputRefs(newCondition);
+      // if the new IS NOT NULL has no input ref, there is redundancy here, bail out
+      if (inputRefs.isEmpty()) {
+        return;

Review comment:
       this rule doesn't look right to me right now with this `isRedundantIsNotNull` call which:
   * walks the entire rel tree
   * invokes simplification on filter nodes
   * unboxes HepRelVertex stuff
   
   can we avoid the original issue of HIVE-25275 by adding a rule which targets expression simplification (`ReduceExpressionsRule`?) to the same rule-group?
   
   can you open a ticket to investigate this?
   






[GitHub] [hive] asolimando commented on a change in pull request #2839: HIVE-25766: java.util.NoSuchElementException in HiveFilterProjectTran…

Posted by GitBox <gi...@apache.org>.
asolimando commented on a change in pull request #2839:
URL: https://github.com/apache/hive/pull/2839#discussion_r761801075



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveFilterProjectTransposeRule.java
##########
@@ -339,6 +339,11 @@ private void check(Filter filter) {
       final RexNode filterCondition = simplify.simplify(filter.getCondition());
 
       final Set<Integer> inputRefs = HiveCalciteUtil.getInputRefs(newCondition);
+      // if the new IS NOT NULL has no input ref, there is redundancy here, bail out
+      if (inputRefs.isEmpty()) {

Review comment:
       This method is invoked only for `IS NOT NULL` calls, so we can't have other kinds of calls here.
   
   That said, I agree about trying to use `RexSimplify` with `uAF` mode more when we create/modify predicates in our CBO rules. https://github.com/apache/hive/pull/2840 is heading in that direction for `HiveJoinPushTransitivePredicatesRule`, for instance.
   
   In that case we end up in a loop, pulling and pushing down the same predicate over and over, which simplification can avoid. On top of that, consistent simplification/normalization might even unlock more optimization opportunities. In the past I have seen some Hive rules taken from Calcite that are missing a `RexSimplify.simplify` call which was added later in Calcite; unfortunately, I can't find the example right now.
   
   In summary, I agree to open a separate Jira ticket for revising our normalization/simplification strategy in CBO rules.






[GitHub] [hive] github-actions[bot] commented on pull request #2839: HIVE-25766: java.util.NoSuchElementException in HiveFilterProjectTran…

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #2839:
URL: https://github.com/apache/hive/pull/2839#issuecomment-1027414680


   This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the dev@hive.apache.org list if the patch is in need of reviews.




[GitHub] [hive] kgyrtkirk commented on a change in pull request #2839: HIVE-25766: java.util.NoSuchElementException in HiveFilterProjectTran…

Posted by GitBox <gi...@apache.org>.
kgyrtkirk commented on a change in pull request #2839:
URL: https://github.com/apache/hive/pull/2839#discussion_r761710984



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveFilterProjectTransposeRule.java
##########
@@ -339,6 +339,11 @@ private void check(Filter filter) {
       final RexNode filterCondition = simplify.simplify(filter.getCondition());
 
       final Set<Integer> inputRefs = HiveCalciteUtil.getInputRefs(newCondition);
+      // if the new IS NOT NULL has no input ref, there is redundancy here, bail out
+      if (inputRefs.isEmpty()) {

Review comment:
       this is interesting... I think we shouldn't have the `IS NOT NULL` at all. Why don't we run `RexSimplify` in `unknownAsFalse` mode?
   
   we may make that change independently from this change - but `uAF` could probably identify more opportunities in case the outer is not an `IS NOT NULL`
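
As background for the `unknownAsFalse` suggestion: a filter keeps a row only when its condition evaluates to TRUE, so an UNKNOWN result rejects the row just as FALSE does. The sketch below models SQL three-valued logic with plain JDK code to show why an explicit `IS NOT NULL` guard in front of a strict comparison does not change which rows a filter keeps. It does not call Calcite; the class, enum, and method names are hypothetical.

```java
public class UnknownAsFalseSketch {

  // SQL three-valued logic: a strict predicate over NULL evaluates to UNKNOWN.
  public enum Tri { TRUE, FALSE, UNKNOWN }

  // IS NOT NULL is never UNKNOWN.
  public static Tri isNotNull(Integer x) {
    return x == null ? Tri.FALSE : Tri.TRUE;
  }

  // A strict comparison: a NULL input yields UNKNOWN.
  public static Tri greaterThan(Integer x, int bound) {
    if (x == null) {
      return Tri.UNKNOWN;
    }
    return x > bound ? Tri.TRUE : Tri.FALSE;
  }

  // Three-valued AND: FALSE dominates, then UNKNOWN.
  public static Tri and(Tri a, Tri b) {
    if (a == Tri.FALSE || b == Tri.FALSE) {
      return Tri.FALSE;
    }
    if (a == Tri.UNKNOWN || b == Tri.UNKNOWN) {
      return Tri.UNKNOWN;
    }
    return Tri.TRUE;
  }

  // A filter keeps a row only on TRUE, so UNKNOWN behaves like FALSE.
  public static boolean filterKeeps(Tri condition) {
    return condition == Tri.TRUE;
  }

  public static void main(String[] args) {
    Integer[] samples = {null, 3, 10};
    for (Integer x : samples) {
      boolean withGuard = filterKeeps(and(isNotNull(x), greaterThan(x, 5)));
      boolean withoutGuard = filterKeeps(greaterThan(x, 5));
      System.out.println(x + ": withGuard=" + withGuard + ", withoutGuard=" + withoutGuard);
    }
  }
}
```

The two conditions differ only in how they reject a NULL input (FALSE versus UNKNOWN), and that difference is invisible at the top level of a filter.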

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveFilterProjectTransposeRule.java
##########
@@ -339,6 +339,11 @@ private void check(Filter filter) {
       final RexNode filterCondition = simplify.simplify(filter.getCondition());
 
       final Set<Integer> inputRefs = HiveCalciteUtil.getInputRefs(newCondition);
+      // if the new IS NOT NULL has no input ref, there is redundancy here, bail out

Review comment:
       the above code blindly assumes that the top-level call is an `IS NOT NULL` without even checking it
   can we be sure of that? because it seems like it removes the top-level call without even a check!
   






[GitHub] [hive] asolimando commented on a change in pull request #2839: HIVE-25766: java.util.NoSuchElementException in HiveFilterProjectTran…

Posted by GitBox <gi...@apache.org>.
asolimando commented on a change in pull request #2839:
URL: https://github.com/apache/hive/pull/2839#discussion_r761830232



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveFilterProjectTransposeRule.java
##########
@@ -339,6 +339,11 @@ private void check(Filter filter) {
       final RexNode filterCondition = simplify.simplify(filter.getCondition());
 
       final Set<Integer> inputRefs = HiveCalciteUtil.getInputRefs(newCondition);
+      // if the new IS NOT NULL has no input ref, there is redundancy here, bail out
+      if (inputRefs.isEmpty()) {

Review comment:
       For the `RelBuilder` use I think [HiveFilterMergeRule](https://github.com/apache/hive/blob/cadf5f731c8cb0a4217d50de8a9b216eaaed1ea1/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveFilterMergeRule.java#L54) is a good example, and it hides all the complexity.
   
   However, when dealing directly with newly created `RexNode`s I am not sure how to achieve that consistently.
   
   Regarding option 3, it seems that it was not enough, at least in the case reported here: https://github.com/apache/hive/pull/2840. My impression is that, if we don't simplify beforehand, `RexSimplify.simplify` is in some cases unable to perform the simplification afterwards, but I need to have a better look.






[GitHub] [hive] asolimando commented on a change in pull request #2839: HIVE-25766: java.util.NoSuchElementException in HiveFilterProjectTran…

Posted by GitBox <gi...@apache.org>.
asolimando commented on a change in pull request #2839:
URL: https://github.com/apache/hive/pull/2839#discussion_r762066925



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveFilterProjectTransposeRule.java
##########
@@ -339,6 +339,11 @@ private void check(Filter filter) {
       final RexNode filterCondition = simplify.simplify(filter.getCondition());
 
       final Set<Integer> inputRefs = HiveCalciteUtil.getInputRefs(newCondition);
+      // if the new IS NOT NULL has no input ref, there is redundancy here, bail out
+      if (inputRefs.isEmpty()) {
+        return;

Review comment:
       As discussed offline, `HiveReduceExpressionsRule` is applied already but it cannot simplify those cases, for instance:
   ```
   [ReduceExpressionsRule(Filter)] to [rel#75:HiveFilter.HIVE.[].any(input=HepRelVertex#21,condition=AND(IS NOT NULL($0), IS NOT NULL(CAST(TRUNC($0, _UTF-16LE'MONTH':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")):DATE), IS NOT NULL(CAST(TRUNC(CAST(TRUNC($0, _UTF-16LE'MONTH':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")):DATE, _UTF-16LE'MONTH':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")):DATE)))]
   ```
   does not get simplified and the expression keeps growing.
   
   [HIVE-25758](https://issues.apache.org/jira/browse/HIVE-25758) is affected by the same issue, but since the expression has no input ref in it, this fix does not help. Once we get a more general solution for this "loop" that is created, we will probably be able to remove or scope down what's provided for [HIVE-25275](https://issues.apache.org/jira/browse/HIVE-25275).
   
   Since the tickets are linked in Jira, I think we are good for now, we can revisit it later upon finding a solution to the general problem.
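
To make the shape of the growth concrete, here is a toy model in plain JDK code. It is not the Hive or Calcite logic: it assumes each pull-up/push-down round derives one more `IS NOT NULL` operand by wrapping the previous one in the projected expression, and that only a syntactic duplicate check is applied. The class and method names are hypothetical, and the string literal is shortened compared to the plan above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class PredicateGrowthSketch {

  // Hypothetical projected expression, modeled as a string rewrite of its argument.
  public static String project(String arg) {
    return "CAST(TRUNC(" + arg + ", 'MONTH')):DATE";
  }

  // One round: wrap the most recently added operand in the projection and
  // append it unless an identical string is already present.
  public static List<String> round(List<String> operands) {
    List<String> next = new ArrayList<>(operands);
    String wrapped = project(operands.get(operands.size() - 1));
    if (!next.contains(wrapped)) {
      next.add(wrapped);
    }
    return next;
  }

  // Renders the operands as a conjunction of IS NOT NULL calls.
  public static String render(List<String> operands) {
    List<String> calls = new ArrayList<>();
    for (String operand : operands) {
      calls.add("IS NOT NULL(" + operand + ")");
    }
    return "AND(" + String.join(", ", calls) + ")";
  }

  public static void main(String[] args) {
    List<String> operands = Collections.singletonList("$0");
    for (int i = 0; i < 3; i++) {
      operands = round(operands);
      System.out.println(render(operands));
    }
  }
}
```

Because every wrapped operand is a new string, the syntactic check never fires and the condition gains one conjunct per round, which is why the discussion points towards simplification rather than duplicate detection.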






[GitHub] [hive] asolimando commented on a change in pull request #2839: HIVE-25766: java.util.NoSuchElementException in HiveFilterProjectTran…

Posted by GitBox <gi...@apache.org>.
asolimando commented on a change in pull request #2839:
URL: https://github.com/apache/hive/pull/2839#discussion_r761798081



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveFilterProjectTransposeRule.java
##########
@@ -339,6 +339,11 @@ private void check(Filter filter) {
       final RexNode filterCondition = simplify.simplify(filter.getCondition());
 
       final Set<Integer> inputRefs = HiveCalciteUtil.getInputRefs(newCondition);
+      // if the new IS NOT NULL has no input ref, there is redundancy here, bail out

Review comment:
       This auxiliary method is only called in the context of this method, which only targets `IS NOT NULL` calls: [HiveFilterProjectTransposeRule.java#L265](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveFilterProjectTransposeRule.java#L265)
   
   The `check` method does not remove any call by itself; it only checks whether the new `IS NOT NULL` that we want to create would be redundant and, if so, prevents the push by making the rule bail out.
   
   I agree that the name of the method is not ideal; maybe we can rename it to `checkRedundancyOfIsNotNull` or something along those lines?



